Preface


Risk management is an important subject in finance. Despite its popularity, 
risk management has a broad and diverse definition that varies from individ- 
ual to individual. One fact remains, however. Every modern risk management 
method comprises a significant amount of computations. To assess the suc- 
cess of a risk management procedure, one has to rely heavily on simulation 
methods. A typical example is the pricing and hedging of exotic options in 
the derivative market. These over-the-counter options experience very thin 
trading volume and yet their nonlinear features forbid the use of analytical 
techniques. As a result, one has to rely upon simulations in order to examine 
their properties. It is therefore not surprising that simulation has become an 
indispensable tool in the financial and risk management industry today. 

Although simulation as a subject has a long history by itself, the same 
cannot be said about risk management. To fully appreciate the power and 
usefulness of risk management, one has to acquire a considerable amount of 
background knowledge across several disciplines: finance, statistics, math- 
ematics, and computer science. It is the synergy of various concepts across 
these different fields that marks the success of modern risk management. Even 
though many excellent books have been written on the subject of simulation, 
none has been written from a risk management perspective. It is therefore 
timely and important to have a text that readily introduces the modern tech- 
niques of simulation and risk management to the financial world. 

This text aims at introducing simulation techniques for practitioners in the 
financial and risk management industry at an intermediate level. The only
prerequisite is a standard undergraduate course in probability at the level of
Hogg and Tanis (2006), say, and some rudimentary exposure to finance. The 
present volume stems from a set of lecture notes used at the Chinese University 
of Hong Kong. It aims at striking a balance between theory and applications 
of risk management and simulations, particularly in the financial sector.
The book comprises three parts. 


• Part one consists of the first three chapters. After introducing the moti-
vations of simulation in Chapter 1, basic ideas of Wiener processes and
Itô's calculus are introduced in Chapters 2 and 3. The reason for this
inclusion is that many students have experienced difficulties in this area 
because they lack the understanding of the theoretical underpinnings of 
these topics. We try to introduce these topics at an operational level 
so that readers can immediately appreciate the complexity and impor- 
tance of stochastic calculus and its relationship with simulations. This 
will pave the way for a smooth transition to option pricing and Greeks 
in later chapters. For readers familiar with these topics, this part can 
be used as a review. 


• Chapters 4 to 6 comprise the second part of the book. This part con-
stitutes the core of an introductory course in risk management.
It covers standard topics in a traditional course in simulation, but at a
much higher and more succinct level. Technical details are left to the refer-
ences, but important ideas are explained in a conceptual manner. Ex-
amples are also given throughout to illustrate the use of these techniques
in risk management. By introducing simulations this way, both students 
with strong theoretical background and students with strong practical 
motivations get excited about the subject early on. 


• The remaining chapters 7 to 10 constitute part three of the book. Here,
more advanced and exotic topics of simulations in financial engineering 
and risk management are introduced. One distinctive feature in these 
chapters is the inclusion of case studies. Many of these cases have strong 
practical bearings such as pricing of exotic options, simulations of Greeks 
in hedging, and the use of Bayesian ideas to assess the impact of jumps. 
By means of these examples, it is hoped that readers can acquire a first- 
hand knowledge about the importance of simulations and apply them 
to their work. 


Throughout the book, examples from finance and risk management have
been incorporated as much as possible, from the early chapter that discusses
the VaR of the Dow to the pricing of basket options in a multi-asset setting.
Almost all of the examples and cases are illustrated with SPLUS and some
with Visual Basic. Readers will be able to reproduce the analysis and learn
about either SPLUS or Visual Basic by replicating some of the empirical work.




Many recent developments in both simulations and risk management, such 
as Gibbs sampling, the use of heavy-tailed distributions in VaR calculation, 
and principal components in multi-asset settings are discussed and illustrated 
in detail. Although many of these developments have found applications in the 
academic literature, they are less understood among practitioners. Inclusion 
of these topics narrows the gap between academic developments and practical 
applications. 

In summary, this text fills a vacuum in the market of simulations and risk 
management. By giving both conceptual and practical illustrations, this text 
not only provides an efficient vehicle for practitioners to apply simulation tech- 
niques, but also demonstrates a synergy of these techniques. The examples 
and discussions in later chapters make recent developments in simulations and 
risk management more accessible to a larger audience. 

Several versions of these lecture notes have been used in a simulation course 
given at the Chinese University of Hong Kong. We are grateful for many 
suggestions, comments, and questions from both students and colleagues. In 
particular, the first author is indebted to Professor John Lehoczky at Carnegie 
Mellon University, from whom he learned the essence of simulations in compu-
tational finance. Part two of this book reflects many of John's ideas and
is reminiscent of his lecture notes at Carnegie Mellon. We would also like
to thank Yu-Fung Lam and Ka-Yung Lau for their help in carrying out some 
of the computational tasks in the examples and for producing the figures in 
LaTeX, and to Mr. Steve Quigley and Ms. Susanne Steitz, both from Wiley, 
for their patience and professional assistance in guiding the preparation and 
production of this book. Financial support from the Research Grant Council 
of Hong Kong throughout this project is gratefully acknowledged. Last, but 
not least, we would like to thank our families for their understanding and 
encouragement in writing this book. Any remaining errors are, of course, our 
sole responsibility. 


NGAI HANG CHAN AND HOI YING WONG 


Shatin, Hong Kong


Introduction


1.1 QUESTIONS 


In this introductory chapter, we are faced with three basic questions: 
• What is simulation?
• Why does one need to learn simulation?
• What has simulation to do with risk management and, in particular,
financial risk management?


1.2 SIMULATION


When faced with uncertainties, one tries to build a probability model. In 
other words, risks and uncertainties can be handled (managed) by means of 
stochastic models. But in real life, building a full-blown stochastic model to 
account for every possible uncertainty is futile. One needs to compromise 
between choosing a model that is a realistic replica of the actual situation 
and choosing one whose mathematical (statistical) analysis is tractable. 

But even equipped with the best insight and powerful mathematical knowl- 
edge, solving a model analytically is an exception rather than a rule. In most 
situations, one relies on an approximated model and learns about this model 
with approximated solutions. It is in this context that simulation comes into 
the picture. Loosely speaking, one can think of simulations as computer ex-
periments. They play the role of the experimental part in physics. When one
studies a physical phenomenon, one relies on physical theories and experimen-
tal verifications. When one tries to model a random phenomenon, one relies
on building an approximated model (or an idealized model) and simulations 
(computer experiments). 

Through simulations, one learns about different characteristics of the model, 
behaviors of the phenomenon, and features of the approximated solutions. Ul- 
timately, simulations offer practitioners the ability to replicate the underlying 
scenario via computer experiments. It helps us to visualize the model, to 
study the model, and to improve the model. 

In this book, we will learn some of the features of simulations. We will see 
that simulation is a powerful tool for analyzing complex situations. We will 
also study different techniques in simulations and their applications in risk 
management. 


1.3 EXAMPLES


Practical implementation of risk management methods usually requires sub-
stantial computations. The computational requirement comes from calculat-
ing summaries, such as value-at-risk, hedging ratio, market β, and so on. In
other words, summarizing data in complex situations is a routine job for a 
risk manager, but the same can be said for a statistician. Therefore, many of 
the simulation techniques developed by statisticians for summarizing data are 
equally applicable in the risk management context. In this section, we shall 
study some typical examples. 


1.3.1 Quadrature 


Numerical integration, also known as quadrature, is probably one of the ear- 
liest techniques that requires simulation. Consider a one-dimensional integral 


$$I = \int_a^b f(x)\,dx, \tag{1.1}$$

where $f$ is a given function. Quadrature approximates $I$ by calculating $f$ at
a number of points $x_1, x_2, \ldots, x_n$ and applying some formula to the resulting
values $f(x_1), \ldots, f(x_n)$. The simplest form is a weighted average

$$\hat{I} = \sum_{i=1}^{n} w_i f(x_i),$$

where $w_1, \ldots, w_n$ are some given weights. Different quadrature rules are
distinguished by using different sets of design points $x_1, \ldots, x_n$ and different
sets of weights $w_1, \ldots, w_n$. As an example, the simplest quadrature rule
divides the interval $[a, b]$ into $n$ equal parts, evaluates $f(x)$ at the midpoint of
each subinterval, and then applies equal weights. In this case

$$\hat{I} = \frac{b - a}{n} \sum_{i=1}^{n} f\bigl(a + (2i - 1)(b - a)/(2n)\bigr).$$


This rule approximates the integral by the sum of the areas of rectangles with
base $(b - a)/n$ and height equal to the value of $f(x)$ at the midpoint of
the base. For $n$ large, we have a sum of many tiny rectangles whose area
closely approximates $I$ in exactly the same way that integrals are introduced
in elementary calculus.
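The midpoint rule can be sketched in a few lines of code. The following Python fragment is only an illustration and is not one of the book's SPLUS or Visual Basic listings; the function name `midpoint_rule` and the test integrand $\sin x$ on $[0, \pi]$, whose integral equals 2, are assumptions chosen for the example.

```python
import math

def midpoint_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] using n equal subintervals."""
    h = (b - a) / n  # common base of the rectangles
    # The i-th midpoint is a + (2i - 1)(b - a)/(2n), for i = 1, ..., n.
    return h * sum(f(a + (2 * i - 1) * h / 2) for i in range(1, n + 1))

# Illustration: the integral of sin over [0, pi] is exactly 2.
approx = midpoint_rule(math.sin, 0.0, math.pi, 1000)
print(approx)
```

Increasing `n` shrinks the rectangles and brings the approximation closer to the exact value.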

Why do we care about evaluating (1.1)? For one, we may want to calcu-
late the expected value of a random quantity $X$ with probability density
function (p.d.f.) $f(x)$. In this case, we calculate

$$E(X) = \int x f(x)\,dx,$$


and quadrature techniques may become handy if this integral cannot be solved 
analytically. Improvements over the simple quadrature have been developed, 
for example, Simpson’s rule and the Gaussian rule. We will not pursue the 
details here, but interested readers may consult Conte and de Boor (1980). 
Clearly, generalizing this idea to higher dimensions is highly nontrivial. Many 
of the numerical integration techniques break down for evaluating high di- 
mensional integrals. (Why?) 


1.3.2 Monte Carlo 


Monte Carlo integration is a different approach to evaluating an integral of f. 
It evaluates $f(x)$ at random points. Suppose that a series of points $x_1, \ldots, x_n$
are drawn independently from the distribution with density $g(x)$. Now

$$I = \int f(x)\,dx = \int \frac{f(x)}{g(x)}\, g(x)\,dx = E_g\!\left(\frac{f(x)}{g(x)}\right), \tag{1.2}$$

where $E_g$ denotes expectation with respect to the distribution $g$. Now, the
sample of points $x_1, \ldots, x_n$ drawn independently from $g$ gives a sample of
values $f(x_i)/g(x_i)$ of the function $f(x)/g(x)$. We estimate the integral (1.2)
by the sample mean

$$\hat{I} = \frac{1}{n} \sum_{i=1}^{n} \frac{f(x_i)}{g(x_i)}.$$

According to classical statistics, $\hat{I}$ is an unbiased estimate of $I$ with variance

$$\mathrm{Var}(\hat{I}) = \frac{1}{n} \mathrm{Var}_g\!\left(\frac{f(x)}{g(x)}\right).$$

As $n$ increases, $\hat{I}$ becomes a more and more accurate estimate of $I$. The
variance (verify) can be estimated by its sample version, viz.,

$$\widehat{\mathrm{Var}}(\hat{I}) = \frac{1}{n^2} \sum_{i=1}^{n} \left(\frac{f(x_i)}{g(x_i)} - \hat{I}\right)^2. \tag{1.3}$$
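As a rough illustration of the Monte Carlo estimate and its estimated variance, consider the following Python sketch. The integrand $f(x) = x^2 e^{-x}$ on $(0, \infty)$, whose integral is $\Gamma(3) = 2$, the exponential sampling density $g(x) = e^{-x}$, and all function names are assumptions made for this example only.

```python
import math
import random

def f(x):
    """Integrand: f(x) = x^2 exp(-x) on (0, inf); its integral is Gamma(3) = 2."""
    return x * x * math.exp(-x)

def g(x):
    """Sampling density: the exponential density exp(-x) on (0, inf)."""
    return math.exp(-x)

def mc_integrate(f, g, sampler, n):
    """Return (I_hat, var_hat): the sample mean of f/g and the estimate (1.3)."""
    ratios = []
    for _ in range(n):
        x = sampler()                  # one draw from the density g
        ratios.append(f(x) / g(x))
    i_hat = sum(ratios) / n
    var_hat = sum((r - i_hat) ** 2 for r in ratios) / n ** 2
    return i_hat, var_hat

rng = random.Random(2006)              # fixed seed so the run is reproducible
i_hat, var_hat = mc_integrate(f, g, lambda: rng.expovariate(1.0), 100000)
print(i_hat, math.sqrt(var_hat))       # estimate and its standard error
```

Here $f(x)/g(x) = x^2$, so the exact variance of the estimate is $\mathrm{Var}_g(x^2)/n = 20/n$, which the printed standard error should roughly reproduce.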


Besides the Monte Carlo method, we should also mention that the idea 
of the quasi-Monte Carlo method has also enjoyed considerable attention re- 
cently. Further discussions on this method are beyond the scope of this book. 
Interested readers may consult the survey article by Hickernell, Lemieux, and 
Owen (2005).


1.4 STOCHASTIC SIMULATIONS 


In risk management, one often encounters stochastic processes like Brown- 
ian motions, geometric Brownian motion, and lognormal distributions. While 
some of these entities may be understood analytically, quantities derived from 
them are often less tractable. For example, how can one evaluate integrals 
like $\int W(t)\,dW(t)$ numerically? More importantly, can we use simulation
techniques to help us understand features and behaviors of geometric Brown-
ian motions or lognormal distributions? To illustrate the idea, we begin with
the lognormal distribution.

Since the lognormal distribution plays such an important role in modeling 
the stock returns, we discuss some properties of the lognormal distribution 
in this section. First, recall that if $X \sim N(\mu, \sigma^2)$, then the random variable
$Y = e^X$ is lognormally distributed, i.e., $\log Y = X$ is normally distributed
with mean $\mu$ and variance $\sigma^2$. Thus, the distribution of $Y$ is given by

$$\begin{aligned}
G(y) &= P(Y \le y) = P(X \le \log y) \\
&= P\bigl((X - \mu)/\sigma \le (\log y - \mu)/\sigma\bigr) \\
&= \Phi\bigl((\log y - \mu)/\sigma\bigr),
\end{aligned}$$

where $\Phi(\cdot)$ denotes the distribution function of a standard normal random
variable. Differentiating $G(y)$ with respect to $y$ gives rise to the p.d.f. of $Y$.
To calculate $EY$, we can integrate it directly with respect to the p.d.f. of $Y$
or we can make use of the normal distribution properties of $X$. Recall that
the moment generating function of $X$ is given by

$$M_X(t) = E(e^{tX}) = e^{\mu t + \frac{1}{2}\sigma^2 t^2}.$$

Thus,

$$EY = E(e^X) = M_X(1) = e^{\mu + \frac{1}{2}\sigma^2}.$$

By a similar argument, we can calculate the second moment of $Y$ and deduce
that

$$\mathrm{Var}(Y) = e^{2\mu + \sigma^2}\,(e^{\sigma^2} - 1).$$
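These two moment formulas can be checked numerically. The Python sketch below is an illustration only; the helper names `lognormal_moments` and `sample_mean_var`, the seed, and the sample size are assumptions. It compares the theoretical mean and variance for $\mu = 0$ and $\sigma^2 = 1$ with their sample counterparts.

```python
import math
import random

def lognormal_moments(mu, sigma2):
    """Theoretical mean and variance of Y = exp(X), X ~ N(mu, sigma2)."""
    mean = math.exp(mu + 0.5 * sigma2)
    var = math.exp(2.0 * mu + sigma2) * (math.exp(sigma2) - 1.0)
    return mean, var

def sample_mean_var(values):
    """Sample mean and (unbiased) sample variance of a list of numbers."""
    n = len(values)
    m = sum(values) / n
    v = sum((y - m) ** 2 for y in values) / (n - 1)
    return m, v

rng = random.Random(1)                 # fixed seed for reproducibility
mu, sigma2 = 0.0, 1.0
ys = [math.exp(rng.gauss(mu, math.sqrt(sigma2))) for _ in range(200000)]
theory = lognormal_moments(mu, sigma2)
sample = sample_mean_var(ys)
print(theory)                          # (e^0.5, e(e - 1))
print(sample)
```

Because the lognormal distribution has a heavy right tail, the sample variance converges noticeably more slowly than the sample mean.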


To evaluate the lognormal density with $\mu = 0$ and $\sigma^2 = 1$, i.e., $EY = e^{0.5}$ and
$\mathrm{Var}(Y) = e(e - 1)$, at 1,000 points in SPLUS and plot it against the standard
normal density, type


> x <- c(0.001, 0.9)
> bounds <- range(qlnorm(x), qnorm(x))
> points.x <- seq(bounds[1], bounds[2], length=1000)
> points.qlnorm <- dlnorm(points.x)
> points.qnorm <- dnorm(points.x)
> plot(0, 0, type='n', xlim=bounds, ylim=range(c(points.qlnorm, points.qnorm)),
+      xlab='', ylab='Density Value')
> lines(points.x, points.qlnorm, col=1, lty=1)
> lines(points.x, points.qnorm, col=1, lty=3)


Fig. 1.1 Densities of a lognormal distribution with mean $e^{0.5}$ and variance $e(e - 1)$,
i.e., $\mu = 0$ and $\sigma^2 = 1$, and a standard normal distribution.


It can be seen from Fig. 1.1 that a lognormal density can never be negative. 
Further, it is skewed to the right and it has a much thicker tail than a normal 
random variable. Note that we have not tried to introduce SPLUS in detail
here. We will only provide an operational discussion for readers to follow. For
a comprehensive introduction to SPLUS, see Venables and Ripley (2002).

For readers who prefer to use Visual Basic, the corresponding code is
listed as follows:


Sub LogNormDist()

    ' Probability levels whose quantiles set the plotting range
    Dim x(2) As Double
    x(1) = 0.001
    x(2) = 0.9

    Dim qlnorm(2) As Double
    Dim qnorm(2) As Double
    qlnorm(1) = Application.WorksheetFunction.LogInv(x(1), 0, 1)
    qlnorm(2) = Application.WorksheetFunction.LogInv(x(2), 0, 1)
    qnorm(1) = Application.WorksheetFunction.NormSInv(x(1))
    qnorm(2) = Application.WorksheetFunction.NormSInv(x(2))

    Dim bounds(2) As Double
    Dim range As Double
    bounds(1) = Application.WorksheetFunction.Min(qlnorm, qnorm)
    bounds(2) = Application.WorksheetFunction.Max(qlnorm, qnorm)
    range = bounds(2) - bounds(1)

    ' Grid of n equally spaced points, written to column A from row 2
    Dim points_x() As Double
    Dim n As Integer
    n = 1000
    ReDim points_x(n)
    points_x(1) = bounds(1)
    Cells(2, 1) = points_x(1)
    Dim i As Integer
    For i = 1 To n - 1
        points_x(i + 1) = points_x(i) + range / (n - 1)
        Cells(i + 2, 1) = points_x(i + 1)
    Next i

    Dim points_qlnorm() As Double
    ReDim points_qlnorm(n)
    Dim points_qnorm() As Double
    ReDim points_qnorm(n)
    Dim a, b As Double

    ' Densities approximated by forward differences of the distribution functions
    For i = 1 To n
        If points_x(i) < 0 Then
            points_qlnorm(i) = 0
        Else
            a = Application.WorksheetFunction.LogNormDist( _
                points_x(i), 0, 1)
            b = Application.WorksheetFunction.LogNormDist( _
                (points_x(i) + 0.00001), 0, 1)
            points_qlnorm(i) = (b - a) / 0.00001
        End If
        Cells(i + 1, 2) = points_qlnorm(i)
        a = Application.WorksheetFunction.NormSDist(points_x(i))
        b = Application.WorksheetFunction.NormSDist(points_x(i) _
            + 0.00001)
        points_qnorm(i) = (b - a) / 0.00001
        Cells(i + 1, 3) = points_qnorm(i)
    Next i

    ' Plot both densities against the grid
    Charts.Add
    ActiveChart.ChartType = xlXYScatterSmoothNoMarkers
    ActiveChart.SetSourceData Source:=Sheets("Sheet1").range( _
        "A2:C1001"), PlotBy:=xlColumns
    ActiveChart.Location Where:=xlLocationAsNewSheet
    With ActiveChart
        .HasTitle = True
        .ChartTitle.Characters.Text = ""
        .Axes(xlCategory, xlPrimary).HasTitle = False
        .Axes(xlValue, xlPrimary).HasTitle = True
        .Axes(xlValue, xlPrimary).AxisTitle.Characters.Text = _
            "Density Value"
    End With
    With ActiveChart.Axes(xlCategory)
        .HasMajorGridlines = False
        .HasMinorGridlines = False
    End With
    With ActiveChart.Axes(xlValue)
        .HasMajorGridlines = False
        .HasMinorGridlines = False
    End With
    ActiveChart.HasLegend = False
    ActiveChart.Axes(xlValue).Select
    With ActiveChart.Axes(xlValue)
        .MinimumScale = -0.01
        .MaximumScale = 0.7
        .MinorUnit = 0.2
        .MajorUnit = 0.2
        .Crosses = xlAutomatic
        .ReversePlotOrder = False
        .ScaleType = xlLinear
        .DisplayUnit = xlNone
    End With
    Selection.TickLabels.NumberFormat = "0.0"
    ActiveChart.Axes(xlCategory).Select
    With Selection.Border
        .Weight = xlHairline
        .LineStyle = xlNone
    End With
    With Selection
        .MajorTickMark = xlOutside
        .MinorTickMark = xlNone
        .TickLabelPosition = xlNextToAxis
    End With
    With ActiveChart.Axes(xlCategory)
        .MinimumScaleIsAuto = True
        .MaximumScaleIsAuto = True
        .MinorUnit = 2
        .MajorUnit = 2
        .Crosses = xlCustom
        .CrossesAt = -4
        .ReversePlotOrder = False
        .ScaleType = xlLinear
        .DisplayUnit = xlNone
    End With
    Selection.TickLabels.NumberFormat = "0"

End Sub


Before ending this chapter, we would like to bring the readers' attention
to some existing books written on this subject. In the statistical community,
many excellent texts have been written on the subject of simulations; see,
for example, Ross (2002) and the references therein. These texts mainly dis-
cuss traditional simulation techniques without much emphasis on finance
and risk management. They are more suitable for a traditional audience in 
statistics. 

In finance, there are several closely related texts. A comprehensive treatise 
on simulations in finance is given in the book by Glasserman (2004). A more 
succinct treatise on simulations in finance is given by Jaeckel (2002). Both 
of these books assume a considerable amount of financial background from
the readers. They are intended for readers at a more advanced level. A book 
on simulation based on MATLAB is Brandimarte (2002). Another related 
book on Monte Carlo in finance is McLeish (2005). The survey article by 
Broadie and Glasserman (1998) offers a succinct account of the essence of 
simulations in finance. For readers interested in knowing more about the 
background of risk management, the two special volumes of Alexander (1998), 
the encyclopedic treatise of Crouhy, Galai and Mark (2001) and the special
volume of Dempster (2002) are excellent sources. The recent monograph of
McNeil, Frey and Embrechts (2005) offers an up-to-date account on topics of 
quantitative risk management. 

The current text can be considered as a synergy between Ross (2002) and 
Glasserman (2004), but at an intermediate level. We hope that readers with
some (but not highly technical) background in either statistics or finance can 
benefit from reading this book. 


1.5 EXERCISES 


1. Verify equation (1.3). 


2. Explain the possible difficulties in implementing quadrature methods to 
evaluate high dimensional numerical integrations. 


3. Using either SPLUS or Visual Basic, simulate 1,000 observations from a
lognormal distribution with mean $e^2$ and variance $e^4(e^2 - 1)$. Calcu-
late the sample mean and sample variance for these observations and
compare their values with the theoretical values.


4. Let a stock have price $S$ at time 0. At time 1, the stock price may
rise to $S_u$ with probability $p$ or fall to $S_d$ with probability $(1 - p)$. Let
$R_S = (S_1 - S)/S$ denote the return of the stock at the end of period 1.

(a) Calculate $m_S = E(R_S)$.

(b) Calculate $v_S = \sqrt{\mathrm{Var}(R_S)}$.

(c) Let $C$ be the price of a European call option of the stock at time 0
and $C_1$ be the price of this option at time 1. Suppose that $C_1 = C_u$
when the stock price rises to $S_u$, and $C_1 = C_d$ when the stock price
falls to $S_d$. Correspondingly, define the return of the call option at
the end of period 1 as $R_C = (C_1 - C)/C$. Calculate $m_C = E(R_C)$.

(d) Show that $v_C = \sqrt{\mathrm{Var}(R_C)} = \sqrt{p(1 - p)}\,(C_u - C_d)/C$.

(e) Let $\Omega = \dfrac{(C_u - C_d)/C}{(S_u - S_d)/S}$ be the so-called elasticity of the option.
Show that $v_C = \Omega v_S$.


Simulation Techniques in Financial Risk Management 
by Ngai Hang Chan and Hoi Ying Wong 
Copyright © 2006 John Wiley & Sons, Inc. 


Brownian Motions and 
Itô's Rule


2.1 INTRODUCTION 


In this chapter, we will learn about the notion of Brownian motion and geo- 
metric Brownian motion, the latter being one of the most popular models in 
financial theory. In addition, Itô's calculus will also be introduced.
The key element of this last concept is to develop an operational understanding
of Itô's calculus so that readers will be able to do simple stochastic integra-
tion such as $\int W^2(t)\,dW(t)$. Finally, we shall learn how to simulate these
processes and study their corresponding features.


2.2 WIENER'S AND ITÔ'S PROCESSES


Consider the model defined by 
$$W(t_{k+1}) = W(t_k) + \epsilon_{t_k} \sqrt{\Delta t}, \tag{2.1}$$

where $t_{k+1} - t_k = \Delta t$, and $k = 0, \ldots, N$ with $t_0 = 0$. In this equation,
$\epsilon_{t_k} \sim N(0, 1)$ are independent and identically distributed (i.i.d.) random variables.
Further, assume that $W(t_0) = 0$. This is known as the random walk model
(except for the factor $\sqrt{\Delta t}$, this equation matches the familiar random
walk model introduced in elementary courses). Note that from this model,
for $j < k$,

$$W(t_k) - W(t_j) = \sum_{i=j}^{k-1} \epsilon_{t_i} \sqrt{\Delta t}.$$

There are a number of consequences:

1. As the right-hand side is a sum of normal random variables, it means
that $W(t_k) - W(t_j)$ is also normally distributed.

2. By taking expectations, we have

$$E(W(t_k) - W(t_j)) = 0,$$

$$\mathrm{Var}(W(t_k) - W(t_j)) = E\Bigl[\sum_{i=j}^{k-1} \epsilon_{t_i} \sqrt{\Delta t}\Bigr]^2 = (k - j)\Delta t = t_k - t_j.$$

3. For $t_1 < t_2 < t_3 < t_4$,

$W(t_4) - W(t_3)$ is uncorrelated with $W(t_2) - W(t_1)$.
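Consequences 2 and 3 can be checked by simulation. The Python sketch below is an illustration under assumed settings ($\Delta t = 0.01$, 100 steps, 5,000 replications, and grid indices chosen by hand); the function names are hypothetical. It iterates (2.1) and estimates the variance of one increment and its covariance with a second, non-overlapping increment.

```python
import math
import random

def random_walk(n_steps, dt, rng):
    """One path W(t_0), ..., W(t_N) of model (2.1) with W(t_0) = 0."""
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, 1.0) * math.sqrt(dt))
    return w

def increment_moments(n_paths, n_steps, dt, j, k, l, m, rng):
    """Sample variance of W(t_k) - W(t_j) and its sample covariance with
    W(t_m) - W(t_l), where j < k < l < m index the time grid."""
    a, b = [], []
    for _ in range(n_paths):
        w = random_walk(n_steps, dt, rng)
        a.append(w[k] - w[j])
        b.append(w[m] - w[l])
    mean_a = sum(a) / n_paths
    mean_b = sum(b) / n_paths
    var_a = sum((u - mean_a) ** 2 for u in a) / (n_paths - 1)
    cov_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (n_paths - 1)
    return var_a, cov_ab

rng = random.Random(42)                # fixed seed for reproducibility
var_a, cov_ab = increment_moments(5000, 100, 0.01, 10, 40, 50, 90, rng)
print(var_a, cov_ab)                   # compare with t_k - t_j = 0.3 and with 0
```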


Equation (2.1) provides a way to simulate a standard Brownian motion 
(Wiener process). To see how, consider partitioning [0,1] into n subintervals 
each with length $1/n$. For each number $t$ in $[0, 1]$, let $[nt]$ denote the
integer part of $nt$. For example, if $n = 10$ and $t = 1/3$, then $[nt] = [10/3] = 3$.
Now define a stochastic process in $[0, 1]$ as follows. For each $t$ in $[0, 1]$, define

$$S_{[nt]} = \frac{1}{\sqrt{n}} \sum_{i=1}^{[nt]} \epsilon_i, \tag{2.2}$$

where $\epsilon_i$ are i.i.d. standard normal random variables. Clearly,

$$S_{[nt]} = S_{[nt]-1} + \epsilon_{[nt]} \frac{1}{\sqrt{n}}, \tag{2.3}$$

which is a special form of (2.1) with $\Delta t = 1/n$ and $W(t) = S_{[nt]}$. Furthermore,
we know that at $t = 1$,

$$S_{[n]} = S_n = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \epsilon_i$$

has a standard normal distribution. Also by the Central Limit Theorem, we
know that $S_n$ tends to a standard normal random variable in distribution even
if the $\epsilon_i$ are only i.i.d. (with mean zero and unit variance) but not necessarily
normally distributed. The idea is
that by taking the limit as $n$ tends to $\infty$, the process $S_{[nt]}$ would tend to a
Wiener process in distribution. Consequently, to simulate a sample path of a
Wiener process, all we need to do is to iterate equation (2.3). Fig. 2.1 shows
the simulations based on (2.3).
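The remark about the Central Limit Theorem can be illustrated with innovations that are not normal. In the Python sketch below, which is an illustration only, the $\epsilon_i$ are taken to be fair $\pm 1$ coin flips (mean 0, variance 1); this choice, the function name `scaled_partial_sums`, and the replication counts are assumptions made for the example.

```python
import math
import random

def scaled_partial_sums(eps):
    """Iterate (2.3): S_k = S_{k-1} + eps_k / sqrt(n), k = 1, ..., n, with S_0 = 0."""
    n = len(eps)
    s = [0.0]
    for e in eps:
        s.append(s[-1] + e / math.sqrt(n))
    return s

rng = random.Random(7)                 # fixed seed for reproducibility
n = 100
terminal = []
for _ in range(4000):
    # +/-1 coin flips: i.i.d. with mean 0 and variance 1, but not normal
    eps = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    terminal.append(scaled_partial_sums(eps)[-1])   # the value S_n
mean_sn = sum(terminal) / len(terminal)
var_sn = sum((v - mean_sn) ** 2 for v in terminal) / (len(terminal) - 1)
print(mean_sn, var_sn)                 # should be near 0 and 1
```

The whole list returned by `scaled_partial_sums` is one approximate sample path on the grid $0, 1/n, \ldots, 1$.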
To generate Fig. 2.1 in SPLUS, type: 


par(mfrow=c(1,1))

BMsim <- function(npaths, nSamples)
{
    p <- npaths
    N <- nSamples
    y <- matrix(rep(0, (N+1)*p), nrow=N+1)
    t <- (c(0:N))/N
    for (j in 1:p)
    {
        z <- rnorm(N, 0, 1)
        y[1,j] <- 0
        for (i in 1:N)
        {
            y[i+1,j] <- (1/sqrt(N))*sum(z[1:i])
        }
    }
    y    # return the simulated paths, which fig() below relies on
}

fig <- function(npath)
{
    p <- npath
    for (i in 1:p)
    {
        N <- 10^i
        y <- BMsim(1, N)
        t <- (c(0:N))/N
        if (i == 1)
        {
            matplot(t, y, type = "l", xlab = "t", ylab = "Brownian motion",
                lty = 1, col = 1, ylim = range(-2.5, 2.5))
        }
        else if (i > 1)
        {
            lines(t, y)
        }
    }
}

fig(5)

Fig. 2.1 Sample paths of the process $S_{[nt]}$ for different $n$ and the same sequence of
$\epsilon_i$.


To generate Fig. 2.2 in SPLUS, type: 


par(mfrow=c(2,2))

BMsim <- function(npaths, nSamples)
{
    p <- npaths
    N <- nSamples
    y <- matrix(rep(0, (N+1)*p), nrow=N+1)
    t <- (c(0:N))/N
    for (j in 1:p)
    {
        z <- rnorm(N, 0, 1)
        y[1,j] <- 0
        for (i in 1:N)
        {
            y[i+1,j] <- (1/sqrt(N))*sum(z[1:i])
        }
    }
    matplot(t, y, type="l", xlab="t", ylab="Brownian motion", lty=1, col=1)
}

BMsim(1, 1000)
BMsim(5, 1000)
BMsim(20, 1000)
BMsim(100, 1000)


Fig. 2.2 Sample paths of Brownian motions on [0,1].

Here is the corresponding Visual Basic code for simulating Brownian
motion:

Sub BMsim()

    Dim npaths As Integer
    Dim nSamples As Integer

    npaths = 10         'no. of paths
    nSamples = 1000     'no. of samples in one path

    Dim t() As Double
    ReDim t(0 To nSamples)

    Dim j As Integer
    For j = 0 To nSamples
        t(j) = j / nSamples
    Next j

    Dim S() As Double
    ReDim S(0 To npaths, 0 To nSamples)

    Dim epsilon As Double
    Dim i As Integer
    For i = 0 To npaths
        epsilon = Application.WorksheetFunction.NormSInv(Rnd)
        S(i, 0) = 0
        For j = 1 To nSamples
            epsilon = Application.WorksheetFunction.NormSInv(Rnd)
            S(i, j) = S(i, j - 1) + (1 / Sqr(nSamples)) * epsilon
        Next j
    Next i

    ' Write each path to its own column of the worksheet
    Sheets("Sheet1").Select
    Cells.Select
    Selection.ClearContents
    If npaths <> 1 Then
        For i = 1 To npaths
            For j = 0 To nSamples
                Cells(j + 1, i) = S(i, j)
            Next j
        Next i
    Else
        For j = 0 To nSamples
            Cells(j + 1, 1) = S(1, j)
        Next j
    End If

    Cells(1, 1).Select

End Sub


In other words, by taking the limit as $\Delta t$ tends to zero, we get a Wiener process
(Brownian motion), i.e.,

$$dW(t) = \epsilon(t) \sqrt{dt},$$

where $\epsilon(t)$ are uncorrelated standard normal random variables. We can in-
terpret this equation as a continuous-time approximation of the random walk
model (2.1), see Chan (2002). Of course, such an approximation can be du- 
bious because we do not know if this limiting operation is well defined. In 
more advanced courses in probability, see Billingsley (1999), for example, it 
is shown that this limiting operation is well defined and, indeed, we obtain 
a Wiener process as a limit of the above operation. Formally, we define a 
Wiener process W(t) as a stochastic process as follows. 


Definition 2.1 A Wiener process $W(t)$ is a stochastic process that satisfies
the following properties:

• For $s < t$, $W(t) - W(s)$ is a normally distributed random variable with
mean 0 and variance $t - s$.

• For $0 < t_1 < t_2 < t_3 < t_4$, $W(t_4) - W(t_3)$ is uncorrelated with
$W(t_2) - W(t_1)$. This is known as the independent increment property.

• $W(t_0) = 0$ with probability one.

From this definition, we can deduce a number of properties.

1. For $t < s$, $E(W(s) \mid W(t)) = E(W(s) - W(t) + W(t) \mid W(t)) = W(t)$. This
is known as the martingale property of the Brownian motion.

2. The process $W(t)$ is nowhere differentiable. Consider

$$E\left(\frac{W(s) - W(t)}{s - t}\right)^2 = \frac{1}{s - t}.$$

This term tends to $\infty$ as $s - t$ tends to 0. Hence, the process cannot
be differentiable and we cannot give a precise mathematical meaning to
the process $dW(t)/dt$.

3. If we formally represent $\epsilon(t) = \frac{dW(t)}{dt}$ and call it the white noise process,
we can only use it as a symbol and its mathematical meaning has to
be interpreted in terms of an integration in the context of a stochastic
differential equation.
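The blow-up of the difference quotient in property 2 is easy to see numerically. The Python sketch below is an illustration only; the function name `mean_squared_quotient`, the step sizes, and the sample size are assumptions. It uses the fact from Definition 2.1 that $W(t + h) - W(t)$ is $N(0, h)$ and estimates the second moment of the quotient for shrinking $h$.

```python
import random

def mean_squared_quotient(h, n, rng):
    """Monte Carlo estimate of E[((W(t+h) - W(t)) / h)^2].
    By Definition 2.1 the increment W(t+h) - W(t) is N(0, h)."""
    total = 0.0
    for _ in range(n):
        increment = rng.gauss(0.0, h ** 0.5)
        total += (increment / h) ** 2
    return total / n

rng = random.Random(3)                 # fixed seed for reproducibility
for h in (0.1, 0.01, 0.001):
    print(h, mean_squared_quotient(h, 20000, rng))   # grows like 1/h
```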


The idea of Wiener process can be generalized as follows. Consider a pro- 
cess X(t) satisfying the following equation: 


$$dX(t) = \mu\,dt + \sigma\,dW(t), \tag{2.4}$$

where $\mu$ and $\sigma$ are constants and $W(t)$ is a Wiener process defined previously.
If we integrate (2.4) over $[0, t]$, we get

$$X(t) = X(0) + \mu t + \sigma W(t),$$

i.e., the process $X(t)$ satisfies the integral equation

$$\int_0^t dX(s) = \mu \int_0^t ds + \sigma \int_0^t dW(s).$$

The process $X(t)$ is also known as a diffusion process or a generalized Wiener
process. In this case, the solution $X(t)$ can be written down analytically in
terms of the parameters $\mu$ and $\sigma$ and the Wiener process $W(t)$. To extend this
idea further, we can let the parameters $\mu$ and $\sigma$ depend on the process $X(t)$
as well. In that case, we have what is known as a general diffusion process or
an Itô's process.

Definition 2.2 An Itô's process is a stochastic process that is the solution to
the following stochastic differential equation (SDE):

$$dX(t) = \mu(x, t)\,dt + \sigma(x, t)\,dW(t). \tag{2.5}$$

In this equation, $\mu(x, t)$ is known as the drift function and $\sigma(x, t)$ is known as
the volatility function of the underlying process. Of course, we need conditions
for $\mu(x, t)$ and $\sigma(x, t)$ to ensure (2.5) has a solution. We will not discuss these
technical details here; further details can be found in Karatzas and Shreve
(1997) or Dana and Jeanblanc (2002). We will just assume that the drift and
the volatility are "nice" enough functions so that the existence of a stochastic
process $\{X(t)\}$ that satisfies (2.5) is guaranteed. Again, this equation has to
be interpreted through integration.
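One common way to approximate a path of (2.5) numerically is to replace $dt$ by a small step $\Delta t$ and $dW(t)$ by $\epsilon \sqrt{\Delta t}$, in the spirit of (2.1). This is the standard Euler discretization, which is not developed in the text at this point, so the Python sketch below should be read as an assumption-laden illustration; the function name `euler_path` and the example drift and volatility functions are hypothetical.

```python
import math
import random

def euler_path(x0, drift, vol, t_end, n_steps, rng):
    """Approximate a path of dX = drift(x, t) dt + vol(x, t) dW on [0, t_end]."""
    dt = t_end / n_steps
    x, t = x0, 0.0
    path = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, 1.0) * math.sqrt(dt)   # Wiener increment, as in (2.1)
        x = x + drift(x, t) * dt + vol(x, t) * dw
        t += dt
        path.append(x)
    return path

rng = random.Random(11)                # fixed seed for reproducibility
# Hypothetical example: the drift pulls X toward 1 and the volatility is constant.
path = euler_path(0.0, lambda x, t: 2.0 * (1.0 - x), lambda x, t: 0.3, 1.0, 1000, rng)
print(path[-1])
```

With constant drift and volatility the scheme reduces to iterating the generalized Wiener process (2.4) on a grid.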


2.3 STOCK PRICE


Recall the multiplicative model 
$$\log S(k + 1) = \log S(k) + w(k).$$

The continuous-time version of this equation is

$$d \log S(t) = \nu\,dt + \sigma\,dW(t).$$

The right-hand side of this equation is normally distributed with mean $\nu\,dt$
and variance $\sigma^2\,dt$. Solving this equation by integration,

$$\log S(t) = \log S(0) + \nu t + \sigma W(t).$$

Then, $E \log S(t) = \log S(0) + \nu t$. Since the expected log price grows linearly
with $t$, just as in a continuous compound interest formula, the process $S(t)$ is
known as a geometric Brownian motion (GBM). Formally, we define

Definition 2.3 Let $X(t)$ be a Brownian motion with drift $\nu$ and variance $\sigma^2$,
i.e.,

$$dX(t) = \nu\,dt + \sigma\,dW(t).$$

The process $S(t) = e^{X(t)}$ is called a geometric Brownian motion with drift
parameter $\mu$, where $\mu = \nu + \frac{1}{2}\sigma^2$. In particular, $S(t)$ satisfies

$$dS(t) = \mu S(t)\,dt + \sigma S(t)\,dW(t),$$

and

$$d \log S(t) = (\mu - \tfrac{1}{2}\sigma^2)\,dt + \sigma\,dW(t). \tag{2.6}$$
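The relation between the two drift parameters can be checked by simulation. Since $\log S(t)$ is normal with mean $\log S(0) + \nu t$ and variance $\sigma^2 t$, the moment generating function from Section 1.4 gives $E\,S(t) = S(0) e^{\mu t}$ with $\mu = \nu + \frac{1}{2}\sigma^2$. The Python sketch below is an illustration only; the function name `gbm_terminal`, the seed, and the sample size are assumptions. It draws terminal values directly from the solution and compares both means.

```python
import math
import random

def gbm_terminal(s0, mu, sigma, t, n, rng):
    """Draw n values of S(t) = s0 * exp(nu*t + sigma*W(t)), nu = mu - sigma^2/2."""
    nu = mu - 0.5 * sigma ** 2
    return [s0 * math.exp(nu * t + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0))
            for _ in range(n)]

rng = random.Random(2)                 # fixed seed for reproducibility
s = gbm_terminal(1.0, 0.03, 0.2, 1.0, 100000, rng)
mean_s = sum(s) / len(s)
mean_log_s = sum(math.log(v) for v in s) / len(s)
print(mean_s, math.exp(0.03))          # E S(t) = S(0) exp(mu t)
print(mean_log_s, 0.03 - 0.02)         # E log S(t) = log S(0) + nu t
```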
To simulate a geometric Brownian motion path with 1,000 time steps in SPLUS
with $\mu = 0.03$ and $\sigma^2 = 0.04$, type


par(mfrow=c(1,1))
p <- 1          #no. of paths
N <- 1000       #no. of samples in one path
S0 <- 1         #current stock price
mu <- 0.03      #mean value
sigma <- 0.2    #standard deviation
nu <- mu - sigma^2/2
x <- matrix(rep(0, (N+1)*p), nrow=(N+1))
y <- matrix(rep(0, (N+1)*p), nrow=(N+1))
t <- (c(0:N))/N
for (j in 1:p)
{
    z <- rnorm(N, 0, 1)
    x[1,j] <- 0
    y[1,j] <- S0
    for (i in 1:N)
    {
        x[i+1,j] <- (1/sqrt(N))*sum(z[1:i])
        y[i+1,j] <- y[1,j]*exp(nu*t[i+1] + sigma*x[i+1,j])
    }
}
matplot(t, y, type="l", xlab="t", ylab="Geometric Brownian motion")


A sample path is plotted in Fig. 2.3.

Fig. 2.3 Geometric Brownian motion.

The corresponding Visual Basic code for simulating the geometric Brownian
motion is:


Sub GBMsim()

Dim nPaths As Integer
Dim nSamples As Integer
Dim S0 As Double
Dim mu As Double
Dim sigma As Double
Dim nu As Double

nPaths = 1              'no. of paths
nSamples = 1000         'no. of samples in one path
S0 = 1                  'current stock price
mu = 0.03               'drift parameter
sigma = 0.2             'volatility (standard deviation)
nu = mu - sigma ^ 2 / 2

'time grid on [0, 1]
Dim t() As Double
ReDim t(0 To nSamples)
Dim j As Integer
For j = 0 To nSamples
    t(j) = j / nSamples
Next j

'X holds the Brownian motion, S the geometric Brownian motion
Dim X() As Double
Dim S() As Double
ReDim X(1 To nPaths, 0 To nSamples)
ReDim S(1 To nPaths, 0 To nSamples)
Dim epsilon As Double
Dim i As Integer
For i = 1 To nPaths
    X(i, 0) = 0
    S(i, 0) = S0
    For j = 1 To nSamples
        epsilon = Application.WorksheetFunction.NormSInv(Rnd)
        X(i, j) = X(i, j - 1) + (1 / Sqr(nSamples)) * epsilon
        S(i, j) = S0 * Exp(nu * t(j) + sigma * X(i, j))
    Next j
Next i

'write path i into column i of the worksheet
Sheets("Sheet1").Select
Cells.Select
Selection.ClearContents
For i = 1 To nPaths
    For j = 0 To nSamples
        Cells(j + 1, i) = S(i, j)
    Next j
Next i

End Sub


Equivalently, S(t) is a geometric Brownian motion starting at S(0) = z if 


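$$S(t) = z e^{X(t)} = z e^{\nu t + \sigma W(t)} = z e^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)}.$$


Using this definition, we see that for $t_0 < t_1 < \cdots < t_n$, the successive ratios
$$\frac{S(t_1)}{S(t_0)},\ \frac{S(t_2)}{S(t_1)},\ \ldots,\ \frac{S(t_n)}{S(t_{n-1})}$$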


are independent random variables by virtue of the independent increment 
property of the Wiener process. The mean and variance of a geometric Brow- 
nian motion can be computed as in the lognormal distribution. Notice that 
since a Brownian motion is normally distributed, we conclude: 


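1. $\log S(t) = X(t) \sim N(\log S(0) + \nu t,\ \sigma^2 t)$.
2. As $S(t) = S(0)e^{\nu t + \sigma W(t)}$,

$$E(S(t)) = E\bigl(E(S(t) \mid S(0) = z)\bigr) = E\bigl(E(z e^{\nu t + \sigma W(t)} \mid S(0) = z)\bigr),$$
where, for a given starting value $z$,
$$E\bigl(z e^{\nu t + \sigma W(t)}\bigr) = z\,E\bigl(e^{(\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)}\bigr)$$
$$= z e^{(\mu - \frac{1}{2}\sigma^2)t}\,E\bigl(e^{\sigma\sqrt{t}\,\epsilon}\bigr) \qquad (\epsilon = W(t)/\sqrt{t} \sim N(0,1))$$
$$= z e^{(\mu - \frac{1}{2}\sigma^2)t + \frac{1}{2}\sigma^2 t}$$
$$= z e^{\mu t} = S(0)e^{\mu t}.$$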


This equation has an interesting economic implication in the case where
$\mu$ is positive but small relative to $\sigma^2$. On one hand, if $\mu > 0$, then the
mean value $E(S(t))$ tends to $\infty$ as $t$ tends to $\infty$. On the other hand, if
$0 < \mu < \frac{1}{2}\sigma^2$, then the process $X(t) = X(0) + (\mu - \frac{1}{2}\sigma^2)t + \sigma W(t)$ has
a negative drift, i.e., it is drifting in the negative direction as $t$ tends
to $\infty$. It is intuitively clear (and can be shown mathematically) that
$X(t)$ tends to $-\infty$. As a consequence, the original price $S(t) = e^{X(t)}$
tends to 0. The geometric Brownian motion $S(t)$ is drifting closer to zero
as time goes on, yet its mean value $E(S(t))$ is continuously increasing.
This example demonstrates the fact that the mean function sometimes
can be misleading in describing the process.


3. Similarly, we can show that 


$$\mathrm{Var}(S(t)) = S(0)^2 e^{2\nu t + \sigma^2 t}\left(e^{\sigma^2 t} - 1\right) = S(0)^2 e^{2\mu t}\left(e^{\sigma^2 t} - 1\right).$$
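
The three conclusions above can be checked by simulation. The following is a minimal sketch in Python (an assumption of this illustration; the text itself uses SPLUS and Visual Basic). The function names, seeds, and the second parameter set ($\mu = 0.01$, $\sigma = 0.3$, $t = 50$) are hypothetical choices made only to satisfy $0 < \mu < \frac{1}{2}\sigma^2$.

```python
import math
import random


def gbm_mean(s0, mu, t):
    """E(S(t)) = S(0) exp(mu t)."""
    return s0 * math.exp(mu * t)


def gbm_variance(s0, mu, sigma, t):
    """Var(S(t)) = S(0)^2 exp(2 mu t) (exp(sigma^2 t) - 1)."""
    return s0 ** 2 * math.exp(2.0 * mu * t) * (math.exp(sigma ** 2 * t) - 1.0)


def sample_gbm(s0, mu, sigma, t, n_paths, seed):
    """Draw S(t) = S(0) exp(nu t + sigma W(t)) with W(t) ~ N(0, t)."""
    rng = random.Random(seed)
    nu = mu - 0.5 * sigma ** 2
    return [s0 * math.exp(nu * t + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0))
            for _ in range(n_paths)]


def summarize(draws):
    """Return (sample mean, sample variance, sample median)."""
    n = len(draws)
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)
    return mean, var, sorted(draws)[n // 2]


# Case 1: the parameters of the SPLUS example, observed at t = 1.
short = sample_gbm(1.0, 0.03, 0.2, 1.0, n_paths=20000, seed=1)
# Case 2: 0 < mu < sigma^2 / 2 over a long horizon (illustrative values).
long_run = sample_gbm(1.0, 0.01, 0.3, 50.0, n_paths=20000, seed=2)

print("t = 1 : sample (mean, var, median) =", summarize(short))
print("        theory (mean, var)         =",
      (gbm_mean(1.0, 0.03, 1.0), gbm_variance(1.0, 0.03, 0.2, 1.0)))
print("t = 50: sample median =", summarize(long_run)[2],
      " theoretical mean =", gbm_mean(1.0, 0.01, 50.0))
```

In the second case the theoretical mean exceeds $S(0)$, while the sample median, which describes a typical path, falls well below $S(0)$. This is the sense in which the mean function can mislead.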


2.4 ITO'S FORMULA 


In the preceding section, we define S(t) in terms of log S(t) as a Brownian 
motion. Although such a definition facilitates many of the calculations, it may 
sometimes be desirable to examine the behavior of the original price process 
S(t) directly. To see how this can be done, first recall from calculus that 


$$d\log S(t) = \frac{dS(t)}{S(t)}.$$
We might be tempted to substitute this elementary fact into (2.6) to get
$$\frac{dS(t)}{S(t)} = \nu\,dt + \sigma\,dW(t).$$


However, this computation is NOT exactly correct since it involves the
differential $dW(t)$. A rule of thumb is that whenever we need to substitute
quantities regarding $dW(t)$, there is a correction term that needs to be
accounted for. We shall provide an argument for this correction term later. For
the time being, the correct expression of the previous equation should be


$$\frac{dS(t)}{S(t)} = \left(\nu + \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW(t) = \mu\,dt + \sigma\,dW(t), \tag{2.7}$$


as $\nu = \mu - \frac{1}{2}\sigma^2$. The correction term required when transforming $\log S(t)$
to $S(t)$ is supplied by Itô's lemma. We shall talk about this in the next
theorem. Before doing that, there are a number of remarks.


Remarks 


1. The term $dS(t)/S(t)$ can be thought of as the differential return of a
stock and equation (2.7) says that the differential return possesses a
simple form $\mu\,dt + \sigma\,dW(t)$.

2. Note that (2.7) is an equation about the ratio $dS(t)/S(t)$. This
term can also be thought of as the instantaneous return of the stock.
Hence equation (2.7) is describing the dynamics of the instantaneous
return process.


3. In the case of deterministic dynamics, i.e., without the stochastic
component $dW(t)$ in (2.7), this equation reduces to the familiar form of
a compound return. For example, let $P(t)$ denote the price of a bond
that pays \$1 at time $t = T$. Assuming the interest rate $r$ is constant over
time and there are no other payments before maturity, the price of the
bond satisfies
$$\frac{dP(t)}{P(t)} = r\,dt.$$
In other words, $P(t) = P(0)e^{rt} = e^{-r(T-t)}$, after taking the boundary
condition $P(T) = 1$ into account.


4. Note that equation (2.7) provides a way to simulate the price process
$S(t)$. Suppose we start at $t_0$ and let $t_k = t_0 + k\Delta t$. According to (2.7),
the simulation equation is


$$S(t_{k+1}) - S(t_k) = \mu S(t_k)\Delta t + \sigma S(t_k)\epsilon(t_k)\sqrt{\Delta t},$$


where $\epsilon(t_k)$ are i.i.d. standard normal random variables. Rearranging this
equation we get


$$S(t_{k+1}) = \left[1 + \mu\Delta t + \sigma\epsilon(t_k)\sqrt{\Delta t}\right]S(t_k), \tag{2.8}$$


which is a multiplicative model, but the coefficient is normal rather
than lognormal. So this equation does not generate the lognormal price
distribution. However, when $\Delta t$ is sufficiently small, the differences may
be negligible.


5. Instead of using (2.7), we can use equation (2.6) for the log prices and
get
$$\log S(t_{k+1}) - \log S(t_k) = \nu\Delta t + \sigma\epsilon(t_k)\sqrt{\Delta t}.$$


This equation leads to
$$S(t_{k+1}) = e^{\nu\Delta t + \sigma\epsilon(t_k)\sqrt{\Delta t}}\,S(t_k), \tag{2.9}$$


which is also a multiplicative model, but now the random coefficient is
lognormal. In general, we can use either (2.8) or (2.9) to simulate stock
prices.
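
The two simulation equations (2.8) and (2.9) can be compared on the same random draws. The sketch below is written in Python (an assumption of this illustration, not the text's SPLUS), and the function names `simulate_euler` and `simulate_log` are hypothetical. It reuses the parameters $\mu = 0.03$ and $\sigma = 0.2$ from the earlier SPLUS example.

```python
import math
import random


def simulate_euler(s0, mu, sigma, dt, eps):
    """Scheme (2.8): S_{k+1} = [1 + mu*dt + sigma*eps_k*sqrt(dt)] * S_k."""
    path = [s0]
    for e in eps:
        path.append(path[-1] * (1.0 + mu * dt + sigma * e * math.sqrt(dt)))
    return path


def simulate_log(s0, mu, sigma, dt, eps):
    """Scheme (2.9): S_{k+1} = exp(nu*dt + sigma*eps_k*sqrt(dt)) * S_k."""
    nu = mu - 0.5 * sigma ** 2
    path = [s0]
    for e in eps:
        path.append(path[-1] * math.exp(nu * dt + sigma * e * math.sqrt(dt)))
    return path


rng = random.Random(42)
n_steps = 1000
dt = 1.0 / n_steps
# The same standard normal draws drive both schemes.
eps = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]

euler_path = simulate_euler(1.0, 0.03, 0.2, dt, eps)
log_path = simulate_log(1.0, 0.03, 0.2, dt, eps)

# Largest relative gap between the two paths over the whole horizon.
max_rel_gap = max(abs(a - b) / b for a, b in zip(euler_path, log_path))
print("terminal values:", euler_path[-1], log_path[-1])
print("largest relative gap:", max_rel_gap)
```

With a step this small the two paths stay close, in line with the remark that the differences may be negligible. Scheme (2.9) also guarantees positive prices for any step size, whereas (2.8) can in principle produce a negative factor when $\Delta t$ is large.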


With this background, we are now ready to state the celebrated Itô's
lemma, which accounts for the correction term.

Theorem 2.1 Suppose the random process $x(t)$ satisfies the diffusion equation
$$dx(t) = a(x,t)\,dt + b(x,t)\,dW(t),$$


where $W(t)$ is a standard Brownian motion. Let the process $y(t) = F(x,t)$ for
some function $F$. Then the process $y(t)$ satisfies Itô's equation


$$dy(t) = \left(\frac{\partial F}{\partial x}a + \frac{\partial F}{\partial t} + \frac{1}{2}\frac{\partial^2 F}{\partial x^2}b^2\right)dt + \frac{\partial F}{\partial x}b\,dW(t). \tag{2.10}$$

Proof. Observe that if the process is deterministic, ordinary calculus shows
that for a function of two variables like $y(t) = F(x,t)$, the total differential
$dy$ is given by


$$dy = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial t}\,dt = \frac{\partial F}{\partial x}(a\,dt + b\,dW) + \frac{\partial F}{\partial t}\,dt.$$

Comparing this expression with (2.10), we see that there is an extra correction
term $\frac{1}{2}\frac{\partial^2 F}{\partial x^2}b^2$ in front of $dt$. To see how this term arises, consider expanding
the function $F$ in a Taylor's expansion up to terms of first order in $\Delta t$. Note
that since $\Delta W$ and hence $\Delta x$ are of order $\sqrt{\Delta t}$, such an expansion would lead
to terms with the second order in $\Delta x$. In this case,


$$y + \Delta y = F(x,t) + \frac{\partial F}{\partial x}\Delta x + \frac{\partial F}{\partial t}\Delta t + \frac{1}{2}\frac{\partial^2 F}{\partial x^2}(\Delta x)^2$$
$$= F(x,t) + \frac{\partial F}{\partial x}(a\Delta t + b\Delta W) + \frac{\partial F}{\partial t}\Delta t + \frac{1}{2}\frac{\partial^2 F}{\partial x^2}(a\Delta t + b\Delta W)^2.$$


Now focus on the quadratic expression in the last term. When expanded, it
becomes


$$a^2(\Delta t)^2 + 2ab(\Delta t)(\Delta W) + b^2(\Delta W)^2.$$


The first two terms of the above expression are of orders higher than $\Delta t$, so
they can be dropped as we only want terms up to the order of $\Delta t$. The last
term $b^2(\Delta W)^2$ is all that remains. Since $\Delta W \sim N(0, \Delta t)$ (recall the
earlier fact that $dW(t) = \epsilon(t)\sqrt{dt}$), it can be shown that $(\Delta W)^2 \to \Delta t$. In
other words, we have the following approximation


$$dW(t)^2 = dt \quad\text{or}\quad dW(t) \approx \sqrt{dt}.$$
Substituting this into the expansion, we have


$$y + \Delta y = F(x,t) + \left(\frac{\partial F}{\partial x}a + \frac{\partial F}{\partial t} + \frac{1}{2}\frac{\partial^2 F}{\partial x^2}b^2\right)\Delta t + \frac{\partial F}{\partial x}b\,\Delta W.$$


Taking the limit as $\Delta t \to 0$ and noting $y(t) = F(x,t)$ completes the proof. □
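
The key step of the proof, replacing $(\Delta W)^2$ by $\Delta t$, can be illustrated numerically: the sum of squared Brownian increments over $[0, t]$ settles near $t$ as the grid is refined. The following Python sketch is only an illustration under that reading; the function name `quadratic_variation` and the seed are hypothetical.

```python
import math
import random


def quadratic_variation(t, n_steps, rng):
    """Sum of (Delta W)^2 over n_steps equal subintervals of [0, t]."""
    dt = t / n_steps
    total = 0.0
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Delta W ~ N(0, dt)
        total += dw * dw
    return total


rng = random.Random(7)
for n in (10, 100, 1000, 10000):
    # Each line uses a fresh set of increments on a finer grid of [0, 1].
    print(n, quadratic_variation(1.0, n, rng))
```

The printed sums fluctuate around 1 with a spread that shrinks as the number of steps grows, which is the sense in which $dW(t)^2 = dt$.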
Example 2.1 Suppose $S(t)$ satisfies the geometric Brownian motion equation
$$dS(t) = \mu S(t)\,dt + \sigma S(t)\,dW(t).$$


Now use Itô's formula to find the equation governing the process $F(S(t)) = \log S(t)$.
Using (2.10), we identify $a = \mu S$ and $b = \sigma S$. Further, we know that
$\partial F/\partial S = 1/S$ and $\partial^2 F/\partial S^2 = -1/S^2$. According to (2.10), we get

$$d\log S = \left(\frac{a}{S} - \frac{1}{2}\frac{b^2}{S^2}\right)dt + \frac{b}{S}\,dW = \left(\mu - \tfrac{1}{2}\sigma^2\right)dt + \sigma\,dW,$$


which agrees with the earlier discussion. □


Example 2.2 Evaluate
$$\int_0^t s\,dW(s).$$


To evaluate this integral, let us first guess the answer to be the one given by
the classical integration by parts formula. That is, we might guess
$tW(t) - \int_0^t W(s)\,ds$ to be the answer. To verify it, we need to differentiate this quantity
to see if it matches the answer. To do this, use the following steps:


1. Let $X(t) = W(t)$, then $dX(t) = dW(t)$ and we identify $a = 0$ and $b = 1$
in (2.10).

2. Let $Y(t) = F(W(t), t) = tW(t)$. Then $\partial F/\partial W = t$, $\partial^2 F/\partial W^2 = 0$, and
$\partial F/\partial t = W(t)$.

3. Substituting these expressions into Itô's lemma, we have
$dY(t) = t\,dW(t) + W(t)\,dt$.


4. Integrating the preceding equation, we have


$$Y(t) = \int_0^t s\,dW(s) + \int_0^t W(s)\,ds,$$


that is,
$$\int_0^t s\,dW(s) = tW(t) - \int_0^t W(s)\,ds,$$


as required.
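
The identity of Example 2.2 can also be checked on a simulated path by replacing both integrals with sums over a fine grid. This Python sketch assumes left-endpoint sums for both integrals; the function name `ito_vs_parts` and the seed are hypothetical.

```python
import math
import random


def ito_vs_parts(t, n_steps, seed):
    """Return discretized (int_0^t s dW(s), t W(t) - int_0^t W(s) ds)."""
    rng = random.Random(seed)
    dt = t / n_steps
    w = 0.0             # current value of W(s)
    ito_sum = 0.0       # approximates the stochastic integral of s dW(s)
    riemann_sum = 0.0   # approximates the ordinary integral of W(s) ds
    for k in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        ito_sum += (k * dt) * dw      # integrand s at the left endpoint
        riemann_sum += w * dt         # W at the left endpoint
        w += dw
    return ito_sum, t * w - riemann_sum


lhs, rhs = ito_vs_parts(1.0, 2000, seed=3)
print("int s dW        :", lhs)
print("tW(t) - int W ds:", rhs)
```

The two numbers agree up to a discretization error of order $\Delta t$. Here the classical guess works because the integrand $s$ is deterministic, so no second-order correction appears.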


Example 2.3 Evaluate
$$\int_0^t W(s)\,dW(s).$$


First guess an answer, $W^2(t)/2$, say. Is this answer correct? To check, we
differentiate again and apply Itô's lemma. Using the recipe,


1. Let $X(t) = W(t)$, then $dX(t) = dW(t)$ and we identify $a = 0$ and $b = 1$
in (2.10).

2. Let $Y(t) = F(W(t)) = W^2(t)/2$. Then $\partial F/\partial W = W$, $\partial^2 F/\partial W^2 = 1$,
and $\partial F/\partial t = 0$.


3. Recite Itô's lemma:


$$dY(t) = \left[\frac{\partial F}{\partial X}a + \frac{\partial F}{\partial t} + \frac{1}{2}\frac{\partial^2 F}{\partial X^2}b^2\right]dt + \frac{\partial F}{\partial X}b\,dW(t),$$


so that


$$dY(t) = \frac{1}{2}\,dt + W(t)\,dW(t).$$


4. Integrating the preceding equation, we get
$$W^2(t)/2 = Y(t) = \frac{t}{2} + \int_0^t W(s)\,dW(s).$$


In other words,
$$\int_0^t W(s)\,dW(s) = \frac{W^2(t)}{2} - \frac{t}{2}.$$


5. This time, our initial guess was not correct. We need the extra correction
term $-\frac{t}{2}$ from Itô's lemma.


□
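
The correction term of Example 2.3 shows up clearly in a simulation. The Python sketch below approximates the Itô integral by a left-endpoint sum and compares it with both the naive guess and the corrected answer; the function name `ito_sum_w_dw` and the seed are hypothetical.

```python
import math
import random


def ito_sum_w_dw(t, n_steps, seed):
    """Left-endpoint sum for int_0^t W(s) dW(s); also returns W(t)."""
    rng = random.Random(seed)
    dt = t / n_steps
    w = 0.0
    total = 0.0
    for _ in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += w * dw   # integrand evaluated before the increment
        w += dw
    return total, w


t = 1.0
ito_sum, w_t = ito_sum_w_dw(t, 10000, seed=5)
naive_guess = w_t ** 2 / 2.0            # classical calculus answer
ito_answer = w_t ** 2 / 2.0 - t / 2.0   # answer from Ito's lemma

print("simulated integral:", ito_sum)
print("naive guess       :", naive_guess)
print("Ito answer        :", ito_answer)
```

The simulated sum tracks the Itô answer and differs from the naive guess by about $t/2$. Evaluating the integrand at the left endpoint is what makes the sum an Itô integral.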


Example 2.4 Let $W_t$ be a standard Brownian motion and let $Y_t = W_t^3$.
Evaluate $dY_t$.


Let $X_t = W_t$ and $F(X,t) = X^3$. Then the diffusion is $dX_t = dW_t$ with
$a = 0$ and $b = 1$. Further,
$$\frac{\partial F}{\partial X} = 3X^2, \quad \frac{\partial F}{\partial t} = 0, \quad \frac{\partial^2 F}{\partial X^2} = 6X.$$


Using Itô's lemma, we have
$$dY_t = 3W_t\,dt + 3W_t^2\,dW_t.$$


Integrating both sides of this equation, we get


$$\int_0^t dY_s = \int_0^t 3W_s\,ds + \int_0^t 3W_s^2\,dW_s,$$


$$Y_t = W_t^3 = 3\int_0^t W_s\,ds + 3\int_0^t W_s^2\,dW_s.$$


In other words,
$$\int_0^t W_s^2\,dW_s = \frac{W_t^3}{3} - \int_0^t W_s\,ds.$$


In general, one gets


$$\int_0^t W_s^m\,dW_s = \frac{W_t^{m+1}}{m+1} - \frac{m}{2}\int_0^t W_s^{m-1}\,ds, \quad m = 1, 2, \ldots. \tag{2.11}$$
□
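Example 2.5 Let
$$dX_t = \frac{1}{2}X_t\,dt + X_t\,dW_t. \tag{2.12}$$


Evaluate $d\log X_t$.


From the given diffusion, we have $a = \frac{1}{2}X_t$ and $b = X_t$. Let $Y_t = F(X,t) = \log X_t$. Then
$$\frac{\partial F}{\partial X} = \frac{1}{X}, \quad \frac{\partial^2 F}{\partial X^2} = -\frac{1}{X^2}, \quad \frac{\partial F}{\partial t} = 0.$$


Using Itô's lemma, we get $dY_t = d\log X_t = dW_t$. That is, $Y_t = W_t$. Therefore,
$X_t = e^{W_t}$ is a solution to (2.12). □


Example 2.6 Let the diffusion be
$$dX_t = \frac{1}{2}\,dt + dW_t.$$


Evaluate $de^{X_t}$.


From the given diffusion, we have $a = \frac{1}{2}$ and $b = 1$. Let $Y_t = F(X,t) = e^{X_t}$. Then


$$\frac{\partial F}{\partial X} = e^{X}, \quad \frac{\partial^2 F}{\partial X^2} = e^{X}, \quad \frac{\partial F}{\partial t} = 0.$$
Using Itô's lemma, we get $dY_t = e^{X_t}\,dt + e^{X_t}\,dW_t$ so that


$$dY_t = Y_t\,dt + Y_t\,dW_t.$$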


Example 2.7 Find the solution to the stochastic differential equation
$$dX_t = X_t\,dt + dW_t, \quad X_0 = 0.$$
Multiplying the integrating factor $e^{-t}$ to both sides of the SDE, we have
$$e^{-t}\,dX_t = e^{-t}X_t\,dt + e^{-t}\,dW_t.$$
Let $Y_t = e^{-t}X_t$. Then $Y_0 = 0$ and by means of Itô's lemma, we have
$$dY_t = e^{-t}\,dW_t.$$
Integrating both sides of this equation,
$$Y_t - Y_0 = \int_0^t e^{-s}\,dW_s,$$


so that
$$X_t = e^{t}Y_t = \int_0^t e^{(t-s)}\,dW_s.$$
More generally, if we are given the SDE


$$dX_t = -\mu X_t\,dt + \sigma\,dW_t,$$


then using the same method by considering the process $Y_t = e^{\mu t}X_t$, it can be
easily shown that the solution to this SDE is given by the process


$$X_t = \sigma\int_0^t e^{-\mu(t-s)}\,dW_s + e^{-\mu t}X_0.$$


Such a process is known as the Ornstein-Uhlenbeck process, which is often
used in modeling bond prices.
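
The closed-form solution of Example 2.7 can be compared with a direct step-by-step simulation of the SDE. The Python sketch below uses a simple Euler discretization of $dX_t = X_t\,dt + dW_t$ (an assumption of this illustration, not a method stated in the text) and approximates the stochastic integral with the same increments; the function name and seed are hypothetical.

```python
import math
import random


def euler_and_integral(t, n_steps, seed):
    """Return (Euler value of X_t, left-endpoint sum for int e^{t-s} dW_s)."""
    rng = random.Random(seed)
    dt = t / n_steps
    x = 0.0          # X_0 = 0
    integral = 0.0
    for k in range(n_steps):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        integral += math.exp(t - k * dt) * dw  # e^{t-s} at s = k*dt
        x += x * dt + dw                       # Euler step for dX = X dt + dW
    return x, integral


x_euler, x_formula = euler_and_integral(1.0, 1000, seed=9)
print("Euler simulation of the SDE:", x_euler)
print("integral formula           :", x_formula)
```

Both numbers are driven by the same Brownian increments, so they should agree up to a small discretization error that shrinks with the step size.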


2.5 EXERCISES 


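1. Let $W_t$ be a Wiener process. Suppose the current time is $t_0$. Find the mean and
variance of $X_t$ if
(a) $X_t = \sigma_1(W_t - W_{t_2}) - \sigma_2(W_{t_1} - W_{t_0})$, $t > t_2 > t_1 > t_0$.
(b) $X_t = \sigma_1(W_t - W_{t_2}) - \sigma_2(W_{t_1} - W_{t_0})$, $t > t_1 > t_2 > t_0$.
(c) $X_t = \sum_{i=0}^{n-1} f(W_{t_i})(W_{t_{i+1}} - W_{t_i})$, $t_0 < t_1 < \cdots < t_n = t$.
(d) Use (c) to show that


$$E\left[\int_0^t f(W_\tau, \tau)\,dW_\tau\right] = 0$$


and


$$E\left[\int_0^t f(W_\tau, \tau)\,dW_\tau\right]^2 = \int_0^t E f(W_\tau, \tau)^2\,d\tau.$$


Notice that the above two identities are known as Itô's identities.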


2. Let $X_t$ satisfy the stochastic differential equation


$$dX_t = -\frac{1}{3}\,dt + \frac{1}{2}\,dW_t,$$
where $X_0 = 0$ and $W_t$ is a standard Brownian motion process. Define
$S_t = e^{X_t}$ so that $S_0 = 1$.


(a) Find the stochastic differential equation that governs $S_t$.


(b) Simulate 10 independent paths of $S_t$ for $t = 1, \ldots, 30$. Call these
paths $S_t^i$, $i = 1, \ldots, 10$ and plot them on the same graph.


(c) What can you conclude about $S_t$ for $t$ large?
(d) With $n = 10$, evaluate


$$\frac{1}{n}\sum_{i=1}^{n} S_t^i \tag{2.14}$$


at $t = 30$.


(e) Simulate 100 independent paths and calculate (2.14) with $n = 100$.
What can you conclude about (2.14) when $n$ tends to infinity?


3. A stock price is governed by
$$dS(t) = \alpha S(t)\,dt + \beta S(t)\,dW(t),$$


where $\alpha$ and $\beta$ are given constants and $W(t)$ is a standard Brownian
motion process. Find the stochastic differential equation that governs


$$G(t) = \sqrt{S(t)}.$$


4. Consider a stock price $S$ governed by the geometric Brownian motion
process


$$\frac{dS(t)}{S(t)} = 0.10\,dt + 0.30\,dW(t),$$


where $W(t)$ is a standard Brownian motion process.


(a) Using $\Delta t = 1/12$ and $S(0) = 1$, simulate 5,000 years of the process
$\log S(t)$ and evaluate


$$\frac{1}{t}\log S(t) \tag{2.15}$$
as a function of $t$. Note that (2.15) tends to a limit $p$. What is
the theoretical value of $p$? Does your simulation match this
value?


(b) Evaluate


$$\frac{1}{t}\{\log S(t) - pt\}^2 \tag{2.16}$$


as a function of $t$. Does this tend to a limit?

5. Simulate a standard Brownian motion process $W(t)$ at grids
$0 < \frac{1}{n} < \frac{2}{n} < \cdots < \frac{n-1}{n} < 1$ with $n = 10{,}000$. Let $W_i = W(\frac{i}{n})$ for $i = 0, \ldots, n$
with $W(0) = 0$. Suppose you want to evaluate the integral


$$\int_0^1 W(s)\,dW(s) \tag{2.17}$$
via the approximating sum
$$S_\epsilon = \sum_{i=0}^{n-1}\{(1-\epsilon)W_i + \epsilon W_{i+1}\}\{W_{i+1} - W_i\}. \tag{2.18}$$


(a) Based on simulated values of $W_i$, use (2.18) to evaluate (2.17) with
$\epsilon = 0$. Does your result match the one obtained from Itô's
formula?

(b) Based on simulated values of $W_i$, use (2.18) to evaluate (2.17) with
$\epsilon = \frac{1}{2}$. This is known as the Stratonovich integral. Using your
calculated results, can you guess the difference between Itô's integral
and the Stratonovich integral?


6. Let $W_t$ denote a standard Brownian motion process.
(a) Let $Y_t = F(W_t) = e^{W_t}$. Write down the diffusion equation that
governs $Y_t$.
(b) Evaluate $\int_0^t e^{W_s}\,dW_s$.


7. Denote $X_t$ as the Brownian motion with drift $\mu$ and volatility $\sigma$.


(a) Find df and dg where f(t, X) =tX, and g(t, X) = tx. 


(b) Financial market practitioners usually consider the time average of
the underlying asset price when making investment decisions. If the
asset evolves as a Brownian motion $X_t$, then the time average line
can be viewed as a stochastic variable


$$A_t = \frac{1}{t}\int_0^t X_\tau\,d\tau.$$


What is the distribution of $A_t$?


(c) Suppose $X_0 = 70$, $\mu = 0.05$, and $\sigma = 0.4$. Simulate $X_t$ and $A_t$
with $\Delta t = 0.01$. What are the sample means and variances for $X_t$
and $A_t$ for 1,000 simulations? What is the covariance between the
two random variables, $X_t$ and $A_t$?


(d) Comment on your simulation result. 


Simulation Techniques in Financial Risk Management 
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Black-Scholes Model and 
Option Pricing 


3.1 INTRODUCTION 


In this chapter, we will apply Itô's lemma to derive the celebrated option
pricing formula developed by Black and Scholes in the early 1970s. This formula has
far reaching consequences and plays a fundamental role in modern option 
pricing theory. Immediately after Black and Scholes, Merton strengthened 
and improved the option pricing theory in several ways. To recognize their 
contributions, Merton and Scholes were awarded the Nobel prize in economics 
in 1997. 

What is an option? An option is a financial derivative (contingent claim) 
that gives the holder the right (but not the obligation) to buy or to sell 
an asset for a certain price by a certain date. An option that gives the holder
a purchasing right is termed a call option, whereas a put option gives the
holder the selling right. The price in the contract is known as the exercise
price or strike price (K); the date is known as the expiration or maturity (T). 
American options can be exercised at any time up to expiration. European 
options can be exercised only on the expiration date. As option holders are 
given a right, they have to pay an option premium to enter the contract. This 
premium is usually known as the option price. 

Four basic option positions are possible: 


1. A long position in a call option. Payoff = $\max(S_T - K, 0)$.
2. A long position in a put option. Payoff = $\max(K - S_T, 0)$.
3. A short position in a call option. Payoff = $-\max(S_T - K, 0)$.
4. A short position in a put option. Payoff = $-\max(K - S_T, 0)$.


Fig. 3.1 One period binomial tree: the node $(S_0, f)$ branches up to $(uS_0, f_u)$ and down to $(dS_0, f_d)$.


Notice that the long position in a put option is different from the short position 
of a call option. A long position in an option always has a non-negative payoff 
whereas a short position in an option always has a non-positive payoff, but 
the option premium is collected up front. Option pricing means determining 
the correct option premium. 

To illustrate the Black-Scholes formula, we shall first discuss some funda- 
mental concepts in a one period binomial model from which a risk-neutral 
argument is introduced. 


3.2. ONE PERIOD BINOMIAL MODEL 


Consider a binomial model in one period. Let $S_0$ and $f$ denote the initial
price of one share of a stock and of an option on the stock, respectively. After one period, the
price of the stock can either be $uS_0$ or $dS_0$ where $u > 1$ designates an upward
movement of the stock price and $d < 1$ designates a downward movement of
the stock price. Correspondingly, the payoff of the option after one period
can either be $f_u$ or $f_d$ depending on whether the stock moves up or down.
For instance, $f_u = \max(uS_0 - K, 0)$ and $f_d = \max(dS_0 - K, 0)$ for a call option.
Schematically, the one period outcome can be represented by Fig. 3.1.

Now consider constructing a hedging portfolio as follows. Suppose we long
(buy and hold) $\Delta$ shares of the stock and short (sell) one call option (European).
Suppose that the option lasts for one period $T$ and, during the life of
the option, the stock can either move up from $S_0$ to $uS_0$ or down from $S_0$ to
$dS_0$. Further, suppose that the risk-free rate in this period is denoted by $r$.
The value of this hedging portfolio in the next period is


$\Delta uS_0 - f_u$, if stock moves up,

$\Delta dS_0 - f_d$, if stock moves down.
This portfolio will be risk-free if $\Delta$ is chosen so that the value of this portfolio
is the same at the end of one period regardless of the stock going up or down,
i.e.,


$$\Delta uS_0 - f_u = \Delta dS_0 - f_d.$$
Solving for $\Delta$, we get


$$\Delta = \frac{f_u - f_d}{uS_0 - dS_0}.$$

Since this portfolio is risk-free in the sense that it attains the same value 
regardless of the outcome of the stock, it must earn the risk-free rate. Oth- 
erwise, one could take advantage of an arbitrage opportunity. For example, 
if the return of this hedging portfolio is larger than the risk-free rate, one 
could borrow money from the bank to purchase this portfolio and lock in the 
fixed return. After one period, the proceeds from the portfolio can be used 
to repay the loan and the arbitrageur pockets the difference. Consequently, 
the present value of this portfolio must equal $(\Delta uS_0 - f_u)e^{-rT}$. If we let
$f$ denote the value of the option at present, then the present value of the
portfolio is $S_0\Delta - f$ and according to the no arbitrage assumption,


$$S_0\Delta - f = (\Delta uS_0 - f_u)e^{-rT}.$$
Consequently,


$$f = S_0\Delta - (\Delta uS_0 - f_u)e^{-rT}$$
$$= S_0\Delta(1 - ue^{-rT}) + f_u e^{-rT}$$
$$= \frac{f_u - f_d}{u - d}(1 - ue^{-rT}) + f_u e^{-rT}$$
$$= e^{-rT}\left[\frac{(f_u - f_d)(e^{rT} - u)}{u - d} + f_u\right]$$
$$= e^{-rT}\left[f_u\frac{e^{rT} - d}{u - d} + f_d\frac{u - e^{rT}}{u - d}\right]$$
$$= e^{-rT}\left[pf_u + (1 - p)f_d\right],$$


where $p = \frac{e^{rT} - d}{u - d}$. This identity has a very natural interpretation. If we let the
value $p$, just defined, be the probability that the stock moves up in a risk-neutral
world, then the above formula simply states the fact that, in the risk-neutral
world,


$$f = e^{-rT}E(f) = e^{-rT}\left(pf_u + (1 - p)f_d\right),$$


i.e., the expected value of the option in one period discounted by the
risk-free rate equals the present value of the option. Note that the expected
value in this case is denoted by $E$, which is the expectation taken under the
new probability measure $p$. For this reason, $p$ is known as the risk-neutral
probability. The same reasoning can be used to evaluate the stock itself. Note
that


$$E(S_T) = puS_0 + (1 - p)dS_0$$
$$= pS_0(u - d) + dS_0$$
$$= \frac{e^{rT} - d}{u - d}S_0(u - d) + dS_0$$
$$= e^{rT}S_0.$$


In other words, the stock grows like a risk-free rate under the risk-neutral 
probability (in the risk-neutral world). Therefore, setting the probability of 
the stock price moving up to be p is tantamount to assuming that the return of 
the stock grows like the risk-free rate in a risk-neutral world. In a risk-neutral 
world, all individuals are indifferent to risk and require no compensation for 
risk. The expected return of all securities is the risk-free interest rate. It is 
for this reason that such a computation is usually known as the risk-neutral 
valuation and it is equivalent to the no arbitrage assumption in general. 


Example 3.1 Suppose the current price of one share of a stock is \$20 and
in a period of three months, the price will either be \$22 or \$18. Suppose we
sold a European call option with a strike price of \$21 expiring in three months. Let
the annual risk-free rate be 12% and let $p$ denote the probability that the stock
moves up in 3 months in the risk-neutral world. Note that the payoff of the
option is either $f_u = 1$ (in dollars) if the stock moves up or $f_d = 0$ if the stock moves
down. How much is the option, $f$, worth today? To find $f$, we can use the
risk-neutral valuation method. Recall that from the above discussion,


$$22p + 18(1 - p) = 20e^{0.12/4},$$
so that $p = 0.6523$. Using the expected payoff of the option, we get
$$E(f) = pf_u + (1 - p)f_d = p \cdot 1 + (1 - p) \cdot 0 = p = 0.6523.$$
Therefore, the value of the option for today is
$$f = e^{-rT}E(f) = e^{-0.12/4}\left(pf_u + (1 - p)f_d\right) = 0.633.$$


□


Alternatively, we can try to solve the same problem using the arbitrage-free 
argument. 


Example 3.2 With the same parameters as in the preceding example, consider
solving for $\Delta$. First, since we want a risk-free hedging
portfolio, we want to purchase $\Delta$ shares of the stock and short one European
call option expiring in three months. After three months, the value of the
portfolio can either be


$22\Delta - 1$, if the stock price moves to \$22,
or
$18\Delta$, if the stock price moves to \$18.
This portfolio is risk-free if $\Delta$ is chosen so that the value of the portfolio
remains the same for both alternatives, i.e.,


$$22\Delta - 1 = 18\Delta, \quad\text{which means}\quad \Delta = 0.25.$$
The value of the portfolio in three months becomes
$$22 \times 0.25 - 1 = 4.5 = 18 \times 0.25.$$


By the no arbitrage consideration, this risk-free portfolio must earn the risk-free
interest rate. In other words, the value of the portfolio today must equal
the present value of \$4.5, i.e., $4.5e^{-0.12/4} = 4.367$. If the value of the option
today is denoted by $f$, then the present value of the portfolio equals


$$20 \times 0.25 - f = 4.5e^{-0.12/4} = 4.367.$$


Solving for $f$ gives
$$f = 0.633,$$


which matches the answer of the preceding example. □
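
Both routes to the option value in Examples 3.1 and 3.2 can be reproduced in a few lines. The following Python sketch is illustrative only; the function names `risk_neutral_price` and `hedging_price` are hypothetical and simply encode the formulas for $p$, $\Delta$, and $f$ derived in this section.

```python
import math


def risk_neutral_price(s0, s_up, s_down, f_up, f_down, r, T):
    """Return (p, f) with p = (e^{rT} - d)/(u - d), f = e^{-rT}[p f_u + (1-p) f_d]."""
    u, d = s_up / s0, s_down / s0
    p = (math.exp(r * T) - d) / (u - d)
    return p, math.exp(-r * T) * (p * f_up + (1.0 - p) * f_down)


def hedging_price(s0, s_up, s_down, f_up, f_down, r, T):
    """Return (Delta, f) from the risk-free portfolio: long Delta shares, short one option."""
    delta = (f_up - f_down) / (s_up - s_down)
    value_at_T = delta * s_up - f_up            # same in the up and down states
    return delta, s0 * delta - value_at_T * math.exp(-r * T)


# Data of Examples 3.1 and 3.2: S0 = 20, up 22, down 18, K = 21, r = 12%, T = 3 months.
K = 21.0
f_up, f_down = max(22.0 - K, 0.0), max(18.0 - K, 0.0)
p, price_rn = risk_neutral_price(20.0, 22.0, 18.0, f_up, f_down, 0.12, 0.25)
delta, price_hedge = hedging_price(20.0, 22.0, 18.0, f_up, f_down, 0.12, 0.25)

print("p =", round(p, 4), " Delta =", round(delta, 2))
print("risk-neutral price =", round(price_rn, 3))
print("hedging price      =", round(price_hedge, 3))
```

Changing `f_up` and `f_down` to the put payoffs $\max(K - S_T, 0)$ prices the corresponding put under the same assumptions.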


In general, this principle can be applied to a multi-period binomial tree. We 
will not go into the analysis of a multi-period model and refer the readers to 
Chapter 11 of Hull (2006) for further details. For a comprehensive discussion 
on the discrete-time approach, see Pliska (1997). Although these two examples
are illustrated with a call option, the same principle can
be used to price a put option; again, details can be found in Hull (2006).


3.3 THE BLACK-SCHOLES-MERTON EQUATION 


The Black-Scholes option pricing equation initiated the modern theory of
finance. Its development has triggered an enormous amount of research and
revolutionized the practice of finance. The equation was developed under 
the assumption that the price fluctuation of the underlying security can be 
described by a diffusion process studied earlier. The logic behind the equation 
is conceptually identical to the binomial lattice: at each moment two available 
securities are combined to construct a portfolio that reproduces the local 
behavior of a contingent claim. Historically, the Black-Scholes theory predates 
the binomial lattice. 

To begin, let $S$ denote the price of an underlying security (stock) governed
by a geometric Brownian motion over a time interval $[0, T]$ by


$$dS = \mu S\,dt + \sigma S\,dW, \tag{3.1}$$


where $W$ is a standard Brownian motion process. Assume further that there
is also a risk-free asset (bond) carrying an interest rate $r$ over the time interval
$[0, T]$ such that


$$dB = rB\,dt. \tag{3.2}$$




Consider a contingent claim that is a derivative (call option) of $S$. The price
of this derivative is a function of $S$ and $t$, i.e., let $f(S,t)$ be the price of the
claim at time $t$ when the stock price is $S$. Our goal is to find an equation
that models the behavior of f(S,t). This goal is attained by the celebrated 
Black-Scholes-Merton equation. 


Theorem 3.1 Using the notation just defined, and assuming that the price 
and the bond are described by the geometric Brownian motion (3.1) and the 
compound interest rate model (3.2) respectively, then the price of the derivative 
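of this security satisfies


$$\frac{\partial f}{\partial t} + rS\frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2} = rf. \tag{3.3}$$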
Proof. The idea of this proof is the same as the binomial lattice. In deriving 
the binomial model, we form a portfolio with portions of the stock and the 
bond so that the portfolio exactly matches the return characteristics of the 
derivative in a period-by-period manner. In the continuous-time framework, 
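the matching is done at each instant. Specifically, by Itô's lemma, recall that

$$df = \left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)dt + \frac{\partial f}{\partial S}\sigma S\,dW. \tag{3.4}$$
This is also a diffusion process for $f$ with drift
$\left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)$ and
diffusion coefficient $\frac{\partial f}{\partial S}\sigma S$.

Construct a portfolio of $S$ and $B$ that replicates the behavior of the derivative.
At each time $t$, we select an amount $x_t$ of the stock and an amount $y_t$
of the bond, giving a total portfolio value of $G(t) = x_tS(t) + y_tB(t)$. We wish
to select $x_t$ and $y_t$ so that $G(t)$ replicates the derivative value $f(S,t)$. The
instantaneous gain in value of this portfolio due to changes in security prices
is


$$dG = x_t\,dS + y_t\,dB$$
$$= x_t(\mu S\,dt + \sigma S\,dW) + y_t rB\,dt$$
$$= (x_t\mu S + y_t rB)\,dt + x_t\sigma S\,dW. \tag{3.5}$$


Since we want the portfolio gain of $G(t)$ to behave like the gain of $f$, we match
the coefficients of $dt$ and $dW$ in (3.5) to those of (3.4). First, we match the
coefficient of $dW$ in these two equations and we get


$$x_t = \frac{\partial f}{\partial S}.$$


Second, since $G(t) = x_tS(t) + y_tB(t)$, we get


$$y_t = \frac{1}{B(t)}\left[G(t) - x_tS(t)\right].$$
Third, remember we want $G = f$, therefore,


$$y_t = \frac{1}{B(t)}\left[f(S,t) - \frac{\partial f}{\partial S}S(t)\right].$$


Substituting this expression into (3.5) and matching the coefficient of $dt$ in
(3.4), we have

$$\frac{\partial f}{\partial S}\mu S + \frac{1}{B(t)}\left[f(S,t) - \frac{\partial f}{\partial S}S(t)\right]rB(t) = \frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2.$$

Consequently,
$$\frac{\partial f}{\partial t} + rS\frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2} = rf.$$
□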

Remarks 

1. If $f(S,t) = S$, then $\frac{\partial f}{\partial t} = 0$, $\frac{\partial f}{\partial S} = 1$, and $\frac{\partial^2 f}{\partial S^2} = 0$ and equation (3.3)
reduces to $rS = rS$ so that $f(S,t) = S$ is a solution to (3.3).


2. As another simple example, consider a bond where $f(S,t) = e^{rt}$. This
is a trivial derivative of $S$ and it can be easily shown that this $f$ satisfies
(3.3).


3. In general, (3.3) provides a way to price a derivative by using the
appropriate boundary conditions. Consider a European call option with
strike price $K$ and maturity $T$. Let the price be $C(S,t)$. Clearly, this
derivative must satisfy


$$C(0,t) = 0,$$
$$C(S,T) = \max(S - K, 0).$$
For a European put option, the boundary conditions are
$$P(\infty,t) = 0,$$
$$P(S,T) = \max(K - S, 0).$$


Other derivatives may have different boundary conditions. For a knock-out
option that will be canceled if the underlying asset breaches a
prespecified barrier level ($H$), in addition to the above conditions, we have
an extra boundary condition


$$f(S = H, t) = 0.$$


4. With these boundary conditions, one can try to solve for the function $f$
from the Black-Scholes equation. One problem is that this is a partial
differential equation and there is no guarantee that an analytical solution
exists. Except in the simple case of a European option, one cannot find
an analytic formula for the function $f$. In practice, either simulation or
numerical methods have to be used to find an approximate solution.


5. Alternatively, we can derive equation (3.3) as follows. Construct a portfolio
that consists of shorting one derivative and longing $\frac{\partial f}{\partial S}$ shares of
the stock. Let the value of this portfolio be $\Pi$ and let the value of the
derivative be $f(S,t)$. Then


$$\Pi = -f + \frac{\partial f}{\partial S}S. \tag{3.6}$$
The change $\Delta\Pi$ in the value of this portfolio in the time interval $\Delta t$ is
given by
$$\Delta\Pi = -\Delta f + \frac{\partial f}{\partial S}\Delta S. \tag{3.7}$$


Recall that $S$ follows a geometric Brownian motion so that
$$\Delta S = \mu S\,\Delta t + \sigma S\,\Delta W.$$


Also, from (3.4), the discrete version of $df$ is


$$\Delta f = \left(\frac{\partial f}{\partial S}\mu S + \frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)\Delta t + \frac{\partial f}{\partial S}\sigma S\,\Delta W.$$


Substituting these two expressions into equation (3.7), we get


$$\Delta\Pi = \left(-\frac{\partial f}{\partial t} - \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)\Delta t. \tag{3.8}$$


Note that by holding such a portfolio, the random component $\Delta W$ has
been eliminated completely. Because this equation does not involve
$\Delta W$, this portfolio must earn the risk-free rate during the time $\Delta t$.
Consequently,
$$\Delta\Pi = r\Pi\,\Delta t,$$
where $r$ is the risk-free rate. In other words, using (3.8) and (3.6), we
obtain
$$\left(\frac{\partial f}{\partial t} + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)\Delta t = r\left(f - \frac{\partial f}{\partial S}S\right)\Delta t.$$
Therefore,

$$\frac{\partial f}{\partial t} + rS\frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2} = rf.$$


It should be noted that the portfolio used in deriving (3.3) is not permanently
risk-free. It is risk-free only for an infinitesimally short period of time.
As $S$ and $t$ change, $\frac{\partial f}{\partial S}$ also changes. To keep the portfolio risk-free, we have to
change the relative proportions of the derivative and the stock in the portfolio
continuously.


Example 3.3 Let f denote the price of a forward contract on a non-dividend- 
paying stock with delivery price K and delivery date T. Its price at time t is 
given by 


$$f(S,t) = S - Ke^{-r(T-t)}. \tag{3.9}$$


Hence,


$$\frac{\partial f}{\partial t} = -rKe^{-r(T-t)}, \quad \frac{\partial f}{\partial S} = 1, \quad\text{and}\quad \frac{\partial^2 f}{\partial S^2} = 0.$$


Substituting these into (3.3), we get
$$-rKe^{-r(T-t)} + rS = rf.$$


Thus, the price formula of $f$ given by (3.9) is a solution of the Black-Scholes
equation, indicating that (3.9) is the correct formula. □
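
The substitution in Example 3.3, and the two simple solutions mentioned in the remarks, can be double-checked numerically by evaluating both sides of (3.3) with finite differences. This Python sketch is an illustration only; the helper `bs_residual`, the step size `h`, and the parameter values are hypothetical choices.

```python
import math


def bs_residual(f, S, t, r, sigma, h=1e-2):
    """Left side minus right side of (3.3), using central differences."""
    f_t = (f(S, t + h) - f(S, t - h)) / (2.0 * h)
    f_S = (f(S + h, t) - f(S - h, t)) / (2.0 * h)
    f_SS = (f(S + h, t) - 2.0 * f(S, t) + f(S - h, t)) / h ** 2
    return f_t + r * S * f_S + 0.5 * sigma ** 2 * S ** 2 * f_SS - r * f(S, t)


r, sigma, K, T = 0.05, 0.3, 100.0, 1.0


def stock(S, t):
    return S                                   # f(S,t) = S


def bond(S, t):
    return math.exp(r * t)                     # f(S,t) = e^{rt}


def forward(S, t):
    return S - K * math.exp(-r * (T - t))      # equation (3.9)


def not_a_solution(S, t):
    return S ** 2                              # included only for contrast


for name, f in [("stock", stock), ("bond", bond),
                ("forward", forward), ("S^2", not_a_solution)]:
    print(name, bs_residual(f, 110.0, 0.5, r, sigma))
```

The residual is close to zero for the stock, the bond, and the forward contract, and clearly nonzero for $S^2$, which does not satisfy (3.3).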


The Black-Scholes equation generates two important insights. The first one 
is the concept of risk-neutral pricing. As the Black-Scholes equation does not
involve the drift, $\mu$, of the underlying asset price, the option pricing formula
should be independent of the drift. Therefore, individual preferences toward
the performance or the trend of a particular asset price do not affect the
current price of the option on that asset. The second insight is that one would
be able to derive a price representation of a European option with any payoff 
function from the equation. It is summarized in the following theorem. 


Theorem 3.2 Consider a European option with payoff $F(S)$ and expiration
time $T$. Suppose the continuous compounding interest rate is $r$. Then, the
current European option price is determined by

$$f(S,0) = e^{-rT}E[F(S_T)], \tag{3.10}$$

where $E$ denotes the expectation under the risk-neutral probability that is
derived from the risk-neutral process

$$\frac{dS}{S} = r\,dt + \sigma\,dW(t). \tag{3.11}$$
Proof. Notice that the current price of the option $f(S,0)$ is a deterministic
function of time $t = 0$ and the current asset price $S$. Consider a stochastic
process $\{X_t\}$ that satisfies

$$X_0 = S \quad \text{and} \quad \frac{dX_t}{X_t} = r\,dt + \sigma\,dW(t).$$

Then, $f(S,0) = f(X_0,0)$. Consider the process $f(X,t)$ derived from the
stochastic process $\{X_t\}$. By Itô's lemma, the differential form of $f$ is

$$df = \left(\frac{\partial f}{\partial t} + rX\frac{\partial f}{\partial X} + \frac{1}{2}\sigma^2 X^2\frac{\partial^2 f}{\partial X^2}\right)dt + \sigma X\frac{\partial f}{\partial X}\,dW.$$

The Black-Scholes equation says that the coefficient of $dt$ is identical to the
term $rf$; see Theorem 3.1. The total differential for the pricing function is
simplified as

$$df = rf\,dt + \sigma X\frac{\partial f}{\partial X}\,dW,$$
which implies
$$df - rf\,dt = \sigma X\frac{\partial f}{\partial X}\,dW.$$

The left-hand side of the above equation can be combined with the product
rule of differentiation to yield

$$e^{rt}\,d\left[e^{-rt}f(X,t)\right] = \sigma X\frac{\partial f}{\partial X}\,dW.$$

This expression has an equivalent integration form,
$$e^{-rT}f(X_T,T) - f(X_0,0) = \sigma\int_0^T e^{-rt}X\frac{\partial f}{\partial X}\,dW.$$

The right-hand side is an Itô integral, that is, a limit of sums of mean-zero
Gaussian increments, so that it has an expected value of zero. After taking
expectation on both sides,

$$E\left[e^{-rT}f(X_T,T) - f(X_0,0)\right] = 0.$$

This implies
$$f(X_0,0) = e^{-rT}E[f(X_T,T)].$$

By the terminal condition specified in the Black-Scholes equation,
$f(X_T,T) = F(X_T)$, the payoff of the option contract. Hence, we have

$$f(S,0) = e^{-rT}E[F(X_T)],$$

where the expectation with respect to the random variable $X_T$ is called the
risk-neutral expectation and the process $\{X_t\}$ is called the risk-neutral asset
dynamics. To avoid confusion, financial economists usually use the term "asset
price process in the risk-neutral world ($S_t$)" to represent the $X_t$ in this proof.
This establishes (3.10) and (3.11) and completes the proof. □
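
As a rough numerical illustration of (3.10), the following Python sketch (our own addition, not part of the text's SPLUS code) estimates $e^{-rT}E[F(S_T)]$ by simulation for the forward payoff $F(S) = S - K$ and compares it with (3.9) at $t = 0$. It assumes the exact lognormal solution of (3.11), $S_T = S_0\exp\{(r - \sigma^2/2)T + \sigma\sqrt{T}Z\}$ with $Z \sim N(0,1)$; the function name `mc_price` and all parameter values are illustrative assumptions.

```python
import math
import random

def mc_price(payoff, s0, r, sigma, T, n_paths=100_000, seed=1):
    """Monte Carlo estimate of exp(-rT) * E[payoff(S_T)] under (3.11)."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * T      # mean of log(S_T / S_0)
    vol = sigma * math.sqrt(T)              # standard deviation of log(S_T / S_0)
    total = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_T = s0 * math.exp(drift + vol * z)  # risk-neutral terminal price
        total += payoff(s_T)
    return math.exp(-r * T) * total / n_paths

# Illustrative (assumed) parameters
s0, K, r, sigma, T = 100.0, 95.0, 0.05, 0.2, 1.0
forward_mc = mc_price(lambda s: s - K, s0, r, sigma, T)
forward_exact = s0 - K * math.exp(-r * T)   # formula (3.9) at t = 0
print(forward_mc, forward_exact)
```

The two numbers should agree up to simulation error, which shrinks at the rate $1/\sqrt{n}$ in the number of simulated paths.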
3.4 BLACK-SCHOLES FORMULA


We are now ready to state the pricing formula of a European call option. A 
corresponding formula can also be deduced for a European put option. We 
first establish a key fact about lognormal random variables. 


Lemma 3.1 Let $S$ be a lognormally distributed random variable such that
$\log S \sim N(m, v^2)$ and let $K > 0$ be a given constant. Then

$$E(\max\{S - K, 0\}) = E(S)\Phi(d_1) - K\Phi(d_2), \tag{3.12}$$

where $\Phi(\cdot)$ denotes the distribution function of a standard normal random
variable and

$$d_1 = \frac{1}{v}\left(-\log K + m + v^2\right) = \frac{1}{v}\left(\log E(S/K) + \frac{v^2}{2}\right),$$
$$d_2 = \frac{1}{v}\left(-\log K + m\right) = \frac{1}{v}\left(\log E(S/K) - \frac{v^2}{2}\right).$$


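Proof. Let $g(s)$ denote the p.d.f. of the random variable $S$. Then
$$E(\max(S - K, 0)) = \int_0^\infty \max(s - K, 0)g(s)\,ds = \int_K^\infty (s - K)g(s)\,ds.$$
By definition, since $\log S \sim N(m, v^2)$,
$$E(S) = e^{m + \frac{1}{2}v^2}, \quad \text{so that} \quad \log E(S) = m + \frac{1}{2}v^2.$$
Define the variable $Q$ as

$$Q = \frac{\log S - m}{v}, \quad \text{so that} \quad Q \sim N(0,1).$$

The p.d.f. of $Q$ is given by $\phi(q) = \frac{1}{\sqrt{2\pi}}e^{-q^2/2}$, the p.d.f. of a standard normal
random variable. Since $q = \frac{\log s - m}{v}$, $s = e^{m+qv}$ so that $dq = \frac{ds}{sv}$. Therefore,

$$
\begin{aligned}
E(\max(S - K, 0)) &= \int_K^\infty (s - K)g(s)\,ds \\
&= \int_{\frac{1}{v}(\log K - m)}^\infty \left(e^{m+qv} - K\right)g(e^{m+qv})\,sv\,dq \\
&= \int_{\frac{1}{v}(\log K - m)}^\infty e^{m+qv}\phi(q)\,dq - K\int_{\frac{1}{v}(\log K - m)}^\infty \phi(q)\,dq \\
&= I - II.
\end{aligned}
\tag{3.13}
$$

Note that the third equality follows from the fact that the p.d.f. $g$ of a
lognormal random variable $S$ has the form

$$g(s) = \frac{1}{sv}\phi\left(\frac{\log s - m}{v}\right), \quad \text{so that} \quad g(e^{m+qv})\,sv = \phi(q). \tag{3.14}$$

We now analyze each of the terms $I$ and $II$ in (3.13). Consider the first term.
Since $e^{m+qv}\phi(q) = e^{m+v^2/2}\phi(q - v)$,

$$
\begin{aligned}
I &= e^{m+\frac{v^2}{2}}\int_{\frac{1}{v}(\log K - m) - v}^\infty \phi(q - v)\,d(q - v) \\
&= e^{m+\frac{v^2}{2}}\left[1 - \Phi\left(\frac{\log K - m}{v} - v\right)\right] \\
&= e^{m+\frac{v^2}{2}}\Phi\left(\frac{-\log K + m}{v} + v\right).
\end{aligned}
$$
For the second term, we have
$$II = K\int_{\frac{1}{v}(\log K - m)}^\infty \phi(q)\,dq = K\Phi\left(-\frac{\log K - m}{v}\right).$$

Substituting these two expressions into (3.13), we have

$$E(\max(S - K, 0)) = e^{m+\frac{v^2}{2}}\Phi\left(\frac{-\log K + m}{v} + v\right) - K\Phi\left(\frac{-\log K + m}{v}\right).$$

Observe that since $\log E(S/K) = -\log K + m + \frac{v^2}{2}$,

$$\frac{-\log K + m}{v} + v = \frac{-\log K + m + v^2}{v} = \frac{1}{v}\left(\log E(S/K) + \frac{v^2}{2}\right) = d_1.$$

Similarly, it can be easily shown that

$$d_2 = \frac{-\log K + m}{v}.$$
This completes the proof of the lemma. □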
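
The lemma can be checked numerically. The following Python sketch (our own illustration, with assumed values of $m$, $v$, and $K$) compares the closed form (3.12) with a simulated average of $\max(S - K, 0)$, where $S = e^{m + vZ}$ and $Z \sim N(0,1)$. The helper names `norm_cdf` and `lognormal_call_mean` are hypothetical.

```python
import math
import random

def norm_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lognormal_call_mean(m, v, K):
    """Right-hand side of (3.12) for log S ~ N(m, v^2)."""
    d2 = (-math.log(K) + m) / v
    d1 = d2 + v
    mean_S = math.exp(m + 0.5 * v * v)      # E(S) for a lognormal variable
    return mean_S * norm_cdf(d1) - K * norm_cdf(d2)

# Illustrative (assumed) parameters
m, v, K = 0.1, 0.3, 1.0
rng = random.Random(7)
n = 100_000
mc = sum(max(math.exp(m + v * rng.gauss(0.0, 1.0)) - K, 0.0) for _ in range(n)) / n
closed = lognormal_call_mean(m, v, K)
print(mc, closed)
```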


Using this lemma, we are now ready to state the Black-Scholes pricing 
formula. 


Theorem 3.3 Consider a European call option with strike price $K$ and
expiration time $T$. If the underlying stock pays no dividends during the time $[0,T]$
and if there is a continuously compounded risk-free rate $r$, then the price of
this contract at time 0, $f(S,0) = C(S,0)$, is given by

$$C(S,0) = S\Phi(d_1) - Ke^{-rT}\Phi(d_2), \tag{3.15}$$

where $\Phi(x)$ denotes the cumulative distribution function of a standard normal
random variable evaluated at the point $x$,

$$d_1 = \left[\log(S/K) + (r + \sigma^2/2)T\right]\frac{1}{\sigma\sqrt{T}},$$
$$d_2 = \left[\log(S/K) + (r - \sigma^2/2)T\right]\frac{1}{\sigma\sqrt{T}} = d_1 - \sigma\sqrt{T}.$$


Proof. The proof of this result relies on the risk-neutral valuation. By
Theorem 3.2, we have

$$C(S,0) = e^{-rT}E(\max\{S_T - K, 0\}), \tag{3.16}$$

where $S_T$ denotes the stock price at time $T$, $E$ denotes the risk-neutral
expectation, and

$$dS = rS\,dt + \sigma S\,dW. \tag{3.17}$$
In this case, we have
$$ES_T = S_0e^{rT}. \tag{3.18}$$
From the preceding lemma, we get
$$E(\max\{S_T - K, 0\}) = E(S_T)\Phi(d_1) - K\Phi(d_2).$$

The remaining job is to identify $d_1$, $d_2$, and $E(S_T)$. By construction,
$ES_T = S_0e^{rT}$. From (3.17), we can easily deduce by Itô's lemma that

$$d\log S_t = \nu\,dt + \sigma\,dW_t, \quad \text{with} \quad \nu = r - \frac{1}{2}\sigma^2. \tag{3.19}$$

Consequently,

$$m = E(\log S_T) = \log S_0 + rT - \frac{1}{2}\sigma^2T,$$
$$v^2 = \mathrm{Var}(\log S_T) = \sigma^2T.$$

According to the lemma,

$$
\begin{aligned}
d_1 &= \frac{-\log K + m + v^2}{v} \\
&= \frac{1}{\sigma\sqrt{T}}\left(-\log K + \log S_0 + rT - \frac{1}{2}\sigma^2T + \sigma^2T\right) \\
&= \frac{1}{\sigma\sqrt{T}}\left[\log\left(\frac{S_0}{K}\right) + \left(r + \frac{1}{2}\sigma^2\right)T\right].
\end{aligned}
$$
By similar substitutions, it can be easily shown that

$$d_2 = \frac{1}{\sigma\sqrt{T}}\left[\log(S_0/K) + (r - \sigma^2/2)T\right] = d_1 - \sigma\sqrt{T}.$$

This completes the proof of the Black-Scholes formula (3.15). □


Example 3.4 Consider a five-month European call option on an underlying
stock with a current price of $62, strike price $60, annual risk-free rate 10%,
and a stock volatility of 20% per year. In this case, $S = 62$, $K = 60$,
$r = 0.1$, $\sigma = 0.2$, and $T = 5/12$. Applying (3.15), we get

$$d_1 = \frac{1}{0.2\sqrt{5/12}}\left[\log\left(\frac{62}{60}\right) + \left(0.1 + \frac{0.2^2}{2}\right)\frac{5}{12}\right] = 0.641287,$$

$$d_2 = d_1 - 0.2\sqrt{5/12} = 0.512188.$$

From the normal table, we get $\Phi(d_1) = 0.739332$ and $\Phi(d_2) = 0.695740$.
Consequently,

$$C = (62)(0.739332) - (60)e^{-0.1(5/12)}(0.695740) = 5.798.$$
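
The calculation in Example 3.4 can be reproduced with a few lines of code. The following Python sketch (our own addition; the function names `norm_cdf` and `bs_call` are hypothetical) implements formula (3.15) directly, using the error function for $\Phi$ instead of a normal table.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    """Black-Scholes price (3.15) of a European call on a non-dividend-paying stock."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

# Inputs of Example 3.4: five-month call, S = 62, K = 60, r = 10%, sigma = 20%
price = bs_call(62.0, 60.0, 0.10, 0.20, 5.0 / 12.0)
print(price)   # close to the value 5.798 obtained from the normal table
```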


Remarks 


1. Note that the Black-Scholes pricing formula is derived using a
risk-neutral valuation argument here. Alternatively, for a given derivative
such as a European call option, we can try to solve the partial differential
equation (PDE) given by the Black-Scholes equation (3.3) subject to the
explicit boundary conditions given in Remark 3 in Section 6.3. This was
the original idea of Black and Scholes and it is commonly known as the
PDE approach. Although the PDE approach is feasible, the Black-Scholes
PDE is complex to solve, and the risk-neutral valuation argument offers a
more intuitive approach based on the arbitrage-free argument.


2. For a European put option, the corresponding pricing formula is given
by
$$P = Ke^{-rT}\Phi(-d_2) - S_0\Phi(-d_1),$$
where $r$, $K$, $d_1$, and $d_2$ are defined as in (3.15).

3. To interpret the Black-Scholes formula, look at what happens to $d_1$ and
$d_2$ as $T \to 0$. If $S_0 > K$, they both tend to $\infty$ so that $\Phi(d_1) = \Phi(d_2) = 1$
and $\Phi(-d_1) = \Phi(-d_2) = 0$. This means that

$$C = S_0 - K \quad \text{and} \quad P = 0.$$

On the other hand, if $S_0 < K$, the reverse argument shows $d_1$ and $d_2$
tend to $-\infty$ as $T \to 0$ so that

$$C = 0 \quad \text{and} \quad P = K - S_0.$$

Is this reasonable? When $S_0 > K$ and $T = 0$, the call option
should be worth $S_0 - K$ and the put option is of course worthless. On
the other hand, if $S_0 < K$ and $T = 0$, the put option should be worth
$K - S_0$ and the call option becomes worthless. Thus, the Black-Scholes
formula offers the price that is consistent with the boundary condition.


4. What happens when $T \to \infty$? In this case, $d_1 \to \infty$ and
$Ke^{-rT} \to 0$ (with $d_2 \to \infty$ as well when $r > \sigma^2/2$), so that
$C = S_0$ and $P = 0$. This is known as the perpetual call. If we own the call for a
long time, the stock value will almost certainly increase to a very large
value so that the strike price $K$ is irrelevant. Hence, if we own the call
we could obtain the stock later for essentially nothing, duplicating the
position we would have if we initially bought the stock. Thus, $C = S_0$.


5. The Black-Scholes formula is derived for a European call option under
the situation where the stock pays no dividends. When the underlying
stock does pay dividends at a specific time during the life of the option,
a similar formula to price the option can also be deduced. Again, we
refer the interested readers to Hull (2006) for further details.

6. For an American option where early exercise is allowed, one can no
longer find an exact analytic formula such as (3.15) for the price of a
call. Instead, a range of possible values can be deduced and details are
given in Hull (2006).

7. In using the Black-Scholes formula, one important quantity required is
the value of $\sigma$, the volatility or the risk of the underlying stock. To use
the formula, we can estimate $\sigma$ from the historical data and put this
estimate into the Black-Scholes formula. Such an approach is known
as the historical volatility approach. On the other hand, one can also
use the Black-Scholes formula to imply the value of $\sigma$, known as the
implied volatility. In this latter approach, we substitute the observed
price of the derivative as the real price into the Black-Scholes formula and
solve for $\sigma$. This quantity can be
used to monitor the market's opinion about the volatility of a particular
stock. Analysts often calculate implied volatilities from actively traded
options on a certain stock and use them to calculate the price of a less
actively traded option on the same stock.
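
To make the implied volatility idea of the last remark concrete, the following Python sketch solves the Black-Scholes formula for $\sigma$ numerically. It is a minimal illustration under our own assumptions: the root is found by bisection (the text does not prescribe a method), the search bracket is chosen arbitrarily, and the "observed" price is generated from the formula itself with the inputs of Example 3.4 rather than taken from market data.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    """Black-Scholes call price (3.15)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, r, T, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; relies on the call price being increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, mid, T) > price:
            hi = mid          # model price too high: volatility is smaller
        else:
            lo = mid          # model price too low: volatility is larger
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# "Observed" price generated with sigma = 0.2, then recovered from the price
observed = bs_call(62.0, 60.0, 0.10, 0.20, 5.0 / 12.0)
sigma_hat = implied_vol(observed, 62.0, 60.0, 0.10, 5.0 / 12.0)
print(sigma_hat)
```

Bisection is slow but robust; it needs only the monotonicity of the call price in $\sigma$ and no derivative of the pricing formula.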


EXERCISES 


1. A company's share price is now $60. Six months from now, it will be
either $75 with risk-neutral probability 0.7 or $50 with risk-neutral
probability 0.3. A call option exists on the stock that can be exercised only
at the end of six months with exercise price of $65.

(a) If you wish to establish a perfectly hedged position, what would
you do?

(b) Under each of the two possibilities, what will be the value of your
hedged position?

(c) What is the expected value of the option price at the end of the period?

(d) What is the reasonable option price today?


2. Consider the binomial model of Section 3.2.

(a) Show that the European call option price of the two period model
is given by
$$c_0 = \left[p^2c_{uu} + 2p(1 - p)c_{ud} + (1 - p)^2c_{dd}\right]e^{-rT},$$
where $T$ is the option maturity and

$$c_{uu} = \max(Su^2 - K, 0),$$
$$c_{ud} = \max(Sud - K, 0),$$
$$c_{dd} = \max(Sd^2 - K, 0).$$
(b) Show by induction that the n-period call price is given by

$$c_n = e^{-rT}\sum_{j=0}^{n}\left\{{}_nC_j\,q^j(1 - q)^{n-j}\max\left(Su^jd^{n-j} - K, 0\right)\right\}.$$

(c) Cox, Ross, and Rubinstein [CRR, 1979] propose that $u = e^{\sigma\sqrt{\Delta t}}$
and $d = e^{-\sigma\sqrt{\Delta t}}$, where $\sigma$ is the annualized asset volatility, are
respectively appropriate choices for the upward and downward
factors in implementing the binomial model. Show that

$$\lim_{n\to\infty} c_n = S\Phi(d_1) - Ke^{-rT}\Phi(d_2),$$

the Black-Scholes call price, if the CRR proposal is adopted.


3. By Theorem 3.2, show the put-call parity relation

$$p + S = c + Ke^{-rT}.$$

4. A fixed strike geometric Asian call option has the payoff function
$\max(G_T - K, 0)$, where
$$G_T = \exp\left(\frac{1}{T}\int_0^T \log S_t\,dt\right).$$
By Theorem 3.2 and Lemma 3.1, determine the analytical solution for
the fixed strike geometric Asian call option. (Hints: 1. Apply the result
of question 7(b) of Chapter 2. 2. You can find the answer in Chapter 7.)

5. Consider the partial differential equation:
$$\frac{\partial f}{\partial t} + \frac{1}{2}\sigma^2(t,x)\frac{\partial^2 f}{\partial x^2} + \mu(t,x)\frac{\partial f}{\partial x} + a(t,x)f = 0,$$
$$f(T,x) = F(x).$$

By modifying the proof of Theorem 3.2, show that

$$f(t,x) = E\left[e^{\int_t^T a(\tau, X_\tau)\,d\tau}F(X_T)\right],$$

where $X_\tau$ is the solution to the SDE:
$$dX_\tau = \mu(\tau, X_\tau)\,d\tau + \sigma(\tau, X_\tau)\,dW_\tau, \quad X_t = x.$$
This result is called the Feynman-Kac formula.

6. Suppose the risk-free interest rate and the volatility of an asset are
deterministic functions of time. That means,

$$r = r(t) \quad \text{and} \quad \sigma = \sigma(t).$$
(a) Show that the Black-Scholes equation governing European option
prices, $f(t,S)$, is given by

$$\frac{\partial f}{\partial t} + \frac{1}{2}\sigma^2(t)S^2\frac{\partial^2 f}{\partial S^2} + r(t)S\frac{\partial f}{\partial S} - r(t)f = 0.$$

(b) Show that the European call option price satisfies:

$$f(t,S) = e^{-\int_t^T r(\tau)\,d\tau}E[\max(S_T - K, 0)],$$
where
$$dS_\tau = r(\tau)S_\tau\,d\tau + \sigma(\tau)S_\tau\,dW_\tau, \quad \tau > t, \quad \text{and} \quad S_t = S.$$
Hint: Use the result of question 5.
(c) Hence, show that
$$f(t,S) = C_{BS}(t, S; r = \bar{r}, \sigma = \bar{\sigma}),$$

where $C_{BS}$ is the Black-Scholes formula for a call option with constant
parameters,

$$\bar{r} = \frac{1}{T - t}\int_t^T r(\tau)\,d\tau \quad \text{and} \quad \bar{\sigma}^2 = \frac{1}{T - t}\int_t^T \sigma^2(\tau)\,d\tau.$$

7. A stochastic process $X(t)$ is said to be a martingale under a probability
measure $P$ if $E^P[X(t) \mid X(s), s \le \tau] = X(\tau)$ for all $\tau \le t$, with probability one.
(a) Consider the asset price dynamics under the risk-neutral measure:

$$dS = rS\,dt + \sigma S\,dW.$$

Show that $X(t) = S(t)e^{-rt}$ is a martingale.

(b) Denote $C(t, S; T)$ as the Black-Scholes formula for a European call
option with maturity $T$. Show that $Ce^{r(T-t)}$ is a martingale.

Simulation Techniques in Financial Risk Management
by Ngai Hang Chan and Hoi Ying Wong

Generating Random Variables


4.1 INTRODUCTION 


The first stage of simulation is generation of random numbers. Random num- 
bers serve as the building block of simulation. The second stage of simulation 
is generation of random variables based on random numbers. This includes 
generating both discrete and continuous random variables of known distri- 
butions. In this chapter, we shall study techniques for generating random 
variables. 


4.2 RANDOM NUMBERS 


Random numbers can be generated in a number of ways. For example, they 
were generated manually or mechanically by spinning wheels or rolling dice in 
the old days. Of course, the notion of randomness may be a subjective judg- 
ment. Things that look apparently random may not be random according 
to the strict definition. The modern approach is to use a computer to gen- 
erate pseudo-random numbers successively. These pseudo-random numbers, 
although deterministically generated, constitute a sequence of values having 
the appearance of uniformly (0, 1) distributed random variables. 

One of the most popular devices to generate uniform random numbers is
the congruential generator. Starting with an initial value $x_0$, called the seed,
the computer successively calculates the values $x_n$, $n \ge 1$, via

$$x_n = ax_{n-1} + c \quad \text{modulo } m, \tag{4.1}$$

where $a$, $c$, and $m$ are given positive integers, and the equality means that
the value $ax_{n-1} + c$ is divided by $m$ and the remainder is taken as the value of
$x_n$. Each $x_n$ is one of $0, 1, \ldots, m-1$, and the quantity $x_n/m$ is taken as an
approximation to the values of a uniform (0, 1) random variable. Since each
of the numbers $x_n$ assumes one of the values $0, 1, \ldots, m-1$, it follows that
after some finite number of generated values a value must repeat itself. For
example, if we take $a = c = 1$ and $m = 16$, then

$$x_n = x_{n-1} + 1 \quad \text{modulo } 16.$$
With $x_0 = 1$, the values of $x_n$ cycle through the set in the order
$$\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 0, \ldots\}.$$
When $a = 5$, $c = 1$, and $m = 16$, the cycle becomes
$$\{0, 1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0, \ldots\}.$$

We usually want to choose $a$ and $m$ such that for any given seed $x_0$, the
number of variables that can be generated before repetition occurs is large.
In practice, one may choose $m = 2^{31} - 1$ and $a = 7^5$, where the number 31
corresponds to the bit size of the machine.
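
The recursion (4.1) is short enough to try out directly. The following Python sketch (our own illustration; the function name `lcg` is hypothetical) reproduces the cycle for $a = 5$, $c = 1$, $m = 16$ starting from the seed 0, and then produces a few uniform numbers with $m = 2^{31} - 1$ and $a = 7^5$. For the latter we assume $c = 0$ and an arbitrary seed of 12345, neither of which is specified in the text.

```python
def lcg(seed, a, c, m, count):
    """Return `count` successive values of x_n = (a * x_{n-1} + c) mod m."""
    x = seed
    out = []
    for _ in range(count):
        x = (a * x + c) % m
        out.append(x)
    return out

# Small example: all 16 values are visited before the seed 0 reappears
cycle = lcg(0, 5, 1, 16, 16)
print(cycle)   # [1, 6, 15, 12, 13, 2, 11, 8, 9, 14, 7, 4, 5, 10, 3, 0]

# Larger modulus; c = 0 and the seed 12345 are assumptions for illustration
m = 2 ** 31 - 1
uniforms = [x / m for x in lcg(12345, 7 ** 5, 0, m, 5)]
print(uniforms)
```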

Any set of pseudo-random numbers will by definition fail on some problems. 
It is therefore desirable to have a second generator available for comparison. 
In this case, it may be useful to compare results for a fundamentally different 
generator. 

From now on, we will assume that we can generate a sequence of random
numbers that can be taken as an approximation to the values of a sequence
of independent uniform (0, 1) random variables. We will not explore the
technical details about the construction of good generators; the interested reader
may consult L'Ecuyer (1994) for a survey of random number generators.


4.3 DISCRETE RANDOM VARIABLES


A discrete random variable $X$ is specified by its probability mass function
given by
$$P(X = x_j) = p_j, \quad j = 0, 1, \ldots, \quad \sum_j p_j = 1. \tag{4.2}$$
To generate $X$, generate a random number $U$, which is uniformly distributed
in (0, 1), and set

$$
X = \begin{cases}
x_0 & \text{if } U < p_0, \\
x_1 & \text{if } p_0 \le U < p_0 + p_1, \\
\;\vdots \\
x_j & \text{if } \sum_{i=0}^{j-1}p_i \le U < \sum_{i=0}^{j}p_i, \\
\;\vdots
\end{cases}
$$
Recall that for $0 < a < b < 1$, $P(a \le U < b) = b - a$. Thus,
$$P(X = x_j) = P\left(\sum_{i=0}^{j-1}p_i \le U < \sum_{i=0}^{j}p_i\right) = p_j, \tag{4.3}$$

so that $X$ has the desired distribution. Note that if the $x_i$ are ordered so
that $x_0 < x_1 < \cdots$ and if $F$ denotes the distribution function of $X$, then
$F(x_k) = \sum_{i=0}^{k}p_i$ and so

$$X \text{ equals } x_j \text{ if } F(x_{j-1}) \le U < F(x_j).$$

That is, after generating $U$, we determine the value of $X$ by finding the interval
$[F(x_{j-1}), F(x_j))$ in which $U$ lies. This amounts to evaluating the
inverse of $F$ at $U$, and thus the name inverse transform.
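
The search for the interval containing $U$ can be sketched as follows. This Python fragment is our own illustration, not the text's code: the function name `sample_discrete` and the three-point distribution are assumptions chosen for the example.

```python
import random

def sample_discrete(values, probs, u):
    """Return x_j such that F(x_{j-1}) <= u < F(x_j), scanning from j = 0."""
    cum = 0.0
    for x, p in zip(values, probs):
        cum += p              # cum is now F(x_j)
        if u < cum:
            return x
    return values[-1]         # guard against rounding when u is very close to 1

# Illustrative distribution: P(X=0) = 0.2, P(X=1) = 0.5, P(X=2) = 0.3
values = [0, 1, 2]
probs = [0.2, 0.5, 0.3]

rng = random.Random(42)
draws = [sample_discrete(values, probs, rng.random()) for _ in range(100_000)]
freq = [draws.count(v) / len(draws) for v in values]
print(freq)   # relative frequencies, close to probs
```

The scan takes on average $1 + E(\text{index of } X)$ comparisons, so it pays to list the most likely values first when the ordering of the $x_j$ does not matter.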


Example 4.1 Suppose we want to generate a binomial random variable $X$
with parameters $n$ and $p$.

The probability mass function of $X$ is given by

$$p_i = P(X = i) = \frac{n!}{i!(n - i)!}p^i(1 - p)^{n-i}, \quad i = 0, 1, \ldots, n.$$

From this probability mass function, we see that

$$p_{i+1} = \frac{n - i}{i + 1}\,\frac{p}{1 - p}\,p_i.$$
The algorithm goes as follows:

1. Generate $U$

2. If $U < p_0$, set $X = 0$ and stop

3. If $p_0 \le U < p_0 + p_1$, set $X = 1$ and stop

   ...

4. If $p_0 + \cdots + p_{n-1} \le U < p_0 + \cdots + p_n$, set $X = n$ and stop

Recursively, by letting $i$ be the current value of $X$, $pr = p_i = P(X = i)$, and
$F = F(i) = P(X \le i)$, the probability that $X$ is less than or equal to $i$, the
above algorithm can be succinctly written as:

STEP 1: Generate $U$
STEP 2: $c = p/(1 - p)$, $i = 0$, $pr = (1 - p)^n$, $F = pr$
STEP 3: If $U < F$, set $X = i$ and stop
STEP 4: $pr = [c(n - i)/(i + 1)]\,pr$, $F = F + pr$, $i = i + 1$
STEP 5: Go to Step 3 □


To generate a binomial random variable X with parameters n = 10 and
p = 0.7 in SPLUS, type:

n <- 10
p <- 0.7
U <- runif(1,0,1)
c <- p/(1-p)
i <- 0
pr <- (1-p)^n        #pr = P(X = 0)
f <- pr              #f = F(0)
for (i in 0:n)
{
  if (U < f)
  {
    X <- i
    break
  }
  else
  {
    pr <- c*(n-i)/(i+1)*pr    #recursion giving P(X = i+1)
    f <- f+pr                 #f = F(i+1)
  }
}
X


4.4 ACCEPTANCE-REJECTION METHOD 


In the preceding example, we see how the inverse transform can be used to
generate a known discrete distribution. For most of the standard distributions,
we can simulate their values easily by means of standard built-in routines
available in standard packages. But when we move away from standard
distributions, simulating values becomes more involved. One of the most useful
methods is the acceptance-rejection algorithm.

Suppose we have an efficient method, e.g., a computer package, to simulate
a random variable $Y$ having probability mass function $\{q_j, j \ge 0\}$. We can use
this as a basis for simulating a random variable $X$ having probability mass function
$\{p_j, j \ge 0\}$ by first simulating $Y$ and then accepting this simulated value with
a probability proportional to $p_Y/q_Y$. Specifically, let $c$ be a constant such
that

$$\frac{p_j}{q_j} \le c \quad \text{for all } j \text{ such that } p_j > 0.$$
J 




Then we can simulate the values of $X$ having probability mass function
$p_j = P(X = j)$ as follows:

STEP 1: Simulate the value of $Y$ from $q_j$
STEP 2: Generate a uniform random number $U$
STEP 3: If $U < p_Y/(cq_Y)$, set $X = Y$ and stop. Otherwise, go to Step 1.


Theorem 4.1 The acceptance-rejection algorithm generates a random
variable $X$ such that
$$P(X = j) = p_j, \quad j = 0, 1, \ldots.$$

In addition, the number of iterations of the algorithm needed to obtain $X$ is
a geometric random variable with mean $c$.

Proof. First consider the probability that a single iteration produces the
accepted value $j$. Note that
$$
\begin{aligned}
P(Y = j, \text{it is accepted}) &= P(Y = j)P(\text{accepted} \mid Y = j) \\
&= q_jP(U < p_j/(cq_j)) \\
&= q_jp_j/(cq_j) \\
&= p_j/c.
\end{aligned}
$$

Summing over $j$, we get the probability that a generated random variable is
accepted as

$$P(\text{accepted}) = \sum_j p_j/c = 1/c.$$

Since each iteration independently results in an accepted value with
probability $1/c$, the number of iterations needed is geometric with mean $c$. Finally,

$$
\begin{aligned}
P(X = j) &= \sum_n P(j \text{ accepted on iteration } n) \\
&= \sum_n (1 - 1/c)^{n-1}p_j/c = p_j.
\end{aligned}
$$
□


Example 4.2 Suppose we want to simulate a random variable $X$ taking
values in $\{1, 2, \ldots, 10\}$ with probabilities as follows:

i          1     2     3     4     5     6     7     8     9     10
P(X = i)   0.11  0.12  0.09  0.08  0.12  0.10  0.09  0.11  0.07  0.11

Using the acceptance-rejection method, first generate discrete uniform
random variables over the integers $\{1, \ldots, 10\}$. That is, $P(Y = j) = q_j = 1/10$
for $j = 1, \ldots, 10$. First, compute the number $c$ by setting $c = \max_j p_j/q_j = 1.2$.
Now generate a discrete uniform random variable $Y$ by letting $Y = \lfloor 10U_1 \rfloor + 1$,
where $U_1 \sim U(0,1)$. Then generate another $U_2 \sim U(0,1)$ and check whether
$U_2 \le p_Y/(cq_Y)$. If this condition is satisfied, then $X = Y$ is the simulated
value. Otherwise, repeat the steps again. □

The code in SPLUS is as follows:

k <- 1000
x <- c(rep(0,k))
u1 <- c(rep(0,k))
u2 <- c(rep(0,k))
y <- c(rep(NA,k))
c1 <- 0
n <- 10
p <- c(0.11,0.12,0.09,0.08,0.12,0.1,0.09,0.11,0.07,0.11)
q <- c(rep(1/n,10))
b <- (max(p/q))/n      #b = c*q_j = 1.2/10
for (i in 1:k)
{
  u1[i] <- trunc(runif(1,0,1)*n)+1     #candidate Y, uniform on 1,...,10
  u2[i] <- runif(1,0,1)
  if (u2[i] <= p[u1[i]]/b) x[i] <- u1[i]
  if (x[i]==0) c1 <- c1+1
  else y[i] <- x[i]
}
#c1 counts the number of rejected values
c1
hist(na.omit(y),prob=T,breaks=((0:10)+0.5))
lines(density(na.omit(y)))
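
For comparison, the three steps of the acceptance-rejection algorithm can be written as a loop that repeats until a candidate is accepted, which also lets us count the iterations and check the claim of Theorem 4.1 that their mean is $c$. The following Python sketch is our own illustration using the probabilities of Example 4.2; the function name `draw` and the sample size are assumptions.

```python
import random

p = [0.11, 0.12, 0.09, 0.08, 0.12, 0.10, 0.09, 0.11, 0.07, 0.11]
q = [0.1] * 10                                  # discrete uniform on 1,...,10
c = max(pj / qj for pj, qj in zip(p, q))        # c = 1.2

def draw(rng):
    """Return (X, number of iterations) from the acceptance-rejection algorithm."""
    iterations = 0
    while True:
        iterations += 1
        y = int(10 * rng.random()) + 1          # STEP 1: candidate Y from q
        u = rng.random()                        # STEP 2: uniform U
        if u < p[y - 1] / (c * q[y - 1]):       # STEP 3: accept or retry
            return y, iterations

rng = random.Random(2024)
n = 50_000
samples = [draw(rng) for _ in range(n)]
freq = [sum(1 for y, _ in samples if y == j) / n for j in range(1, 11)]
mean_iters = sum(k for _, k in samples) / n
print(freq)         # close to p
print(mean_iters)   # close to c = 1.2
```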


4.5 CONTINUOUS RANDOM VARIABLES 


Generating continuous random variables is very similar to generating discrete 
random variables. It again relies on two main approaches using uniform ran- 
dom numbers: the inverse transform and the acceptance-rejection method. 


4.5.1 Inverse Transform 


Theorem 4.2 Let $U$ be a uniform (0, 1) random variable. For any
continuous distribution function $F$, the random variable $X$ defined by $X = F^{-1}(U)$
has distribution $F$. Here

$$F^{-1}(u) = \inf\{x : F(x) \ge u\}.$$
Proof. Let $F_X$ denote the distribution of $X = F^{-1}(U)$. Then

$$
\begin{aligned}
F_X(x) &= P(X \le x) \\
&= P(F^{-1}(U) \le x) \\
&= P(U \le F(x)) \\
&= F(x).
\end{aligned}
$$
□

Example 4.3 Let $X$ be an exponential random variable with rate 1. Then its
distribution function is given by $F(x) = 1 - e^{-x}$. Let $x = F^{-1}(u)$, then
$u = F(x) = 1 - e^{-x}$, so that $x = -\log(1 - u)$. Thus, we can generate $X$
by generating $U$ and setting $X = -\log(1 - U)$. Moreover, since $(1 - U)$ has
the same distribution as $U$, which is uniform (0, 1), we can simply set
$X = -\log U$. Finally, it can be seen easily that if $Y \sim \exp(\lambda)$, then $E(Y) = 1/\lambda$
and $Y = X/\lambda$, where $X \sim \exp(1)$. In this case, we can simulate $Y$ by first
simulating $U$ and setting $Y = -\frac{1}{\lambda}\log U$. □
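
A minimal Python sketch of this recipe follows (our own illustration; the function name and the rate $\lambda = 2$ are assumptions). Since Python's `random()` returns values in $[0, 1)$, we apply the logarithm to $1 - U$, which lies in $(0, 1]$ and has the same distribution, so that $\log 0$ cannot occur.

```python
import math
import random

def exp_inverse_transform(lam, rng):
    """Draw Y ~ exp(lam) by the inverse transform Y = -(1/lam) * log(U)."""
    u = 1.0 - rng.random()        # uniform on (0, 1], avoids log(0)
    return -math.log(u) / lam

rng = random.Random(0)
lam = 2.0                          # illustrative rate
draws = [exp_inverse_transform(lam, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)                 # close to E(Y) = 1/lam = 0.5
```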


The previous example illustrates how to apply the inverse transform method 
when the inverse of F can be written down easily. The next example demon- 
strates the case when the inverse of F’ is not readily available. 


Example 4.4 Let $X \sim \Gamma(n, \lambda)$. Then it has distribution function

$$F_X(x) = \int_0^x \frac{\lambda e^{-\lambda y}(\lambda y)^{n-1}}{(n - 1)!}\,dy.$$

Clearly, finding the inverse of $F_X$ is not feasible. But recall that $X = \sum_{i=1}^{n}Y_i$,
where $Y_i \sim \Gamma(1, \lambda)$ i.i.d. Furthermore, each $Y_i$ has distribution function

$$F_Y(y) = \int_0^y \lambda e^{-\lambda s}\,ds,$$

which is the distribution function of an exponential distribution with rate $\lambda$.
Therefore, we can generate $X$ via

$$X = -\frac{1}{\lambda}\log U_1 - \cdots - \frac{1}{\lambda}\log U_n = -\frac{1}{\lambda}\log(U_1 \cdots U_n).$$
□

To generate a random variable X that follows a gamma distribution with
parameters n = 5 and λ = 10 in SPLUS, type:

n <- 5
lambda <- 10
U <- runif(n,0,1)
X <- -(1/lambda)*sum(log(U[1:n]))
X


The message from these two examples is that, although the inverse trans- 
form method is simple, we may need to conduct certain simplifications before 
applying the method. 
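
The gamma construction of Example 4.4 can be checked against the known moments $E(X) = n/\lambda$ and $\mathrm{Var}(X) = n/\lambda^2$. The following Python sketch is our own illustration, valid for integer $n$ only; the function name `gamma_int_shape` and the sample size are assumptions.

```python
import math
import random

def gamma_int_shape(n, lam, rng):
    """Draw X ~ Gamma(n, lam), n a positive integer, as -(1/lam) * log(U_1 ... U_n)."""
    prod = 1.0
    for _ in range(n):
        prod *= 1.0 - rng.random()   # uniform on (0, 1], avoids log(0)
    return -math.log(prod) / lam

rng = random.Random(1)
n_shape, lam = 5, 10.0               # parameters used in the SPLUS example
draws = [gamma_int_shape(n_shape, lam, rng) for _ in range(50_000)]
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)
print(mean, var)                     # close to 0.5 and 0.05
```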
4.5.2 The Rejection Method


Suppose that we can simulate from a density $g$ easily. We can use this as a
basis to simulate from a density $f(x)$ by first generating $Y$ from $g$ and then
accepting the generated value with probability proportional to $f(Y)/g(Y)$.
Specifically, let $c$ be such that

$$\frac{f(y)}{g(y)} \le c \quad \text{for all } y.$$

Then we generate from $f$ via the following algorithm:
STEP 1: Generate $Y$ from a density $g$
STEP 2: Generate a uniform random number $U$
STEP 3: If $U \le \frac{f(Y)}{cg(Y)} := h(Y)$, set $X = Y$, else go to Step 1

This is exactly the same acceptance-rejection method as in the discrete case.
Correspondingly, we have the following result whose proof is almost the same
as in the discrete case.


Theorem 4.3 The random variable $X$ generated by the rejection method has
density $f$. Moreover, the number of iterations that this algorithm needs is a
geometric random variable with mean $c$.

Proof. Let $f(x) = cg(x)h(x)$, where $c \ge 1$ is a constant, $g(x)$ is also a p.d.f.,
and $0 \le h(x) \le 1$. Let $Y$ have p.d.f. $g$ and $U \sim U(0,1)$. Consider

$$f_Y(x \mid U \le h(Y)) = \frac{P(U \le h(Y) \mid Y = x)\,g(x)}{P(U \le h(Y))}.$$

For the first part in the numerator, we have
$$P(U \le h(Y) \mid Y = x) = P(U \le h(x)) = h(x).$$

For the denominator, consider

$$P(U \le h(Y)) = \int h(x)g(x)\,dx = \int \frac{f(x)}{c}\,dx = \frac{1}{c}.$$

Therefore, $f_Y(x \mid U \le h(Y)) = h(x)g(x)c = f(x)$. □
One of the difficulties in using the rejection method is determining the
constant $c$. Our goal is to find the function $cg(x)$ so that $cg(x) \ge f(x)$ and
we can sample easily from the density $g(x)$. This can be achieved by trial and error,
or in certain circumstances, by simple analysis, as illustrated
in the following example.


Example 4.5 Suppose we want to simulate from the density
$$f(x) = 20x(1 - x)^3, \quad 0 < x < 1.$$

First note that $f$ is defined only on the interval (0, 1). We may try $g$ that can
be simulated easily over the same interval, uniform (0, 1), say, that is,
$g(x) = 1$, $0 < x < 1$. To determine the smallest number $c$ such that $f(x)/g(x) \le c$
for all $0 < x < 1$, we first find the maximum value of the ratio
$f(x)/g(x) = 20x(1 - x)^3$. Using calculus, differentiating and setting to zero,

$$\frac{d}{dx}\left(\frac{f(x)}{g(x)}\right) = 20\left[(1 - x)^3 - 3x(1 - x)^2\right] = 0,$$
we find that $x = 1/4$ is the maximizer of $f/g$. Thus

$$\frac{f(x)}{g(x)} \le 20(1/4)(3/4)^3 = 135/64 = c.$$

Therefore,

$$\frac{f(x)}{cg(x)} = 20(64/135)\,x(1 - x)^3 = \frac{256}{27}x(1 - x)^3.$$

The algorithm becomes:
STEP 1: Generate random numbers $U_1$ and $U_2$
STEP 2: If $U_2 \le \frac{256}{27}U_1(1 - U_1)^3$, stop and set $X = U_1$,
else go to Step 1 □


To simulate from this distribution, use the following code:

k <- 1000
x <- c(rep(0,k))
u1 <- c(rep(0,k))
u2 <- c(rep(0,k))
c1 <- 0
y <- c(rep(NA,k))
f <- c(rep(0,k))
for (i in 1:k)
{
  u1[i] <- runif(1,0,1)
  u2[i] <- runif(1,0,1)
  if (u2[i] <= (256/27)*u1[i]*(1-u1[i])^3) x[i] <- u1[i]
  if (x[i]==0) c1 <- c1+1
}
c1
#c1 counts the number of rejected values
for (i in 1:k)
{
  if (x[i]!=0) y[i] <- x[i]
}
y[1:20]

#only the values of y with y not equaling to NA are plotted
hist(na.omit(y),prob=T)
lines(density(na.omit(y)))
#plot the target density f on a grid for comparison
for (i in 1:k)
{
  f[i] <- 20*(i/k)*(1-i/k)^3
}
plot(f,type="l")
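
The same experiment can be written so that every call returns an accepted value together with the number of trials it took. The following Python sketch is our own illustration (the function name `sample_f` and the sample size are assumptions). Since $f(x) = 20x(1 - x)^3$ is the Beta(2, 4) density, whose mean is $1/3$, and since Theorem 4.3 gives a mean of $c = 135/64 \approx 2.11$ trials per accepted value, both quantities can be checked against the simulation.

```python
import random

def sample_f(rng):
    """Rejection sampling from f(x) = 20 x (1 - x)^3 with a uniform(0, 1) proposal."""
    trials = 0
    while True:
        trials += 1
        u1 = rng.random()            # candidate value from g
        u2 = rng.random()            # uniform used for the accept/reject decision
        if u2 <= (256.0 / 27.0) * u1 * (1.0 - u1) ** 3:
            return u1, trials

rng = random.Random(123)
n = 20_000
out = [sample_f(rng) for _ in range(n)]
mean_x = sum(x for x, _ in out) / n
mean_trials = sum(t for _, t in out) / n
print(mean_x)        # close to 1/3
print(mean_trials)   # close to 135/64
```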


4.5.3 Multivariate Normal 


An important application of simulation is to handle high dimensional prob- 
lems. High dimensional problems are usually related to multivariate normal 
distributions (Gaussian distribution). However, most software packages do 
not provide algorithms for generating multivariate normal random variables. 
This section studies algorithms for generating multivariate normal random 
variables. 

A random vector $X$ is said to follow a multivariate normal distribution if
every linear combination of its elements is a normal random variable; in
particular, each element is itself normal. The distribution of $X$ is then
described as


$$X \sim N(m, \Sigma), \tag{4.4}$$
where $m = E[X]$ is the mean vector and $\Sigma = \mathrm{Var}[X]$ is the
variance-covariance matrix. Consider a vector $X = (X_1, \ldots, X_n)^T$ with $X_i \sim N(\mu_i, \sigma_i^2)$.
In this case, the mean vector $m = (\mu_1, \ldots, \mu_n)^T$ and the $n \times n$ matrix

$$\Sigma = [\mathrm{Cov}(X_i, X_j)]_{i,j = 1, \ldots, n}.$$

There is a convenient way to generate a normal random vector $X$ when $\Sigma = I$.
$\Sigma = I$ indicates that the elements of $X$ are independent random variables.
Therefore, we can generate $X_i$ independently and then stack them up to form
the vector $X$. For a normal random vector with dependent components, i.e.,
$\Sigma \ne I$, decomposition methods are useful.


4.5.3.1 Cholesky Decomposition The first method is the Cholesky
decomposition. Consider two correlated standard normal random variables $X_1$ and $X_2$
with correlation coefficient $\rho$, written as

$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}\right).$$

Theorem 4.4 Correlated random variables $X_1$ and $X_2$ can be decomposed
into two uncorrelated random variables $Z_1$ and $Z_2$ through the linear
transformation:

$$Z_1 = X_1, \quad Z_2 = \frac{X_2 - \rho X_1}{\sqrt{1 - \rho^2}}.$$
In other words,
$$X = \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{bmatrix}Z, \quad Z \sim N\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right). \tag{4.5}$$

Proof. As $X_1$ and $X_2$ are linear combinations of normal random variables,
they are also normally distributed. Furthermore,

$$E(X_1) = E(X_2) = 0,$$
$$\mathrm{Var}(X_1) = 1, \quad \mathrm{Var}(X_2) = (1 - \rho^2)\mathrm{Var}(Z_2) + \rho^2\mathrm{Var}(Z_1) = 1,$$
$$\mathrm{Cov}(X_1, X_2) = \mathrm{Cov}\left(Z_1, Z_2\sqrt{1 - \rho^2} + \rho Z_1\right) = \rho.$$

Thus, $X_1$ and $X_2$ have the desired distribution. □

The linear transformation of (4.5) is called the Cholesky decomposition. It
enables us to generate $(X_1, X_2)$ by the following procedure.

STEP 1: Generate $Z_1, Z_2 \sim N(0,1)$ i.i.d.
STEP 2: Set $X_1 = Z_1$ and $X_2 = Z_2\sqrt{1 - \rho^2} + \rho Z_1$.
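
The two steps above can be sketched in a few lines of Python (our own illustration; the function name `correlated_pair`, the value $\rho = 0.6$, and the sample size are assumptions). The sample correlation of the generated pairs should be close to $\rho$, and the sample variance of $X_2$ close to 1.

```python
import math
import random

def correlated_pair(rho, rng):
    """Return (X1, X2), standard normals with correlation rho, from (4.5)."""
    z1 = rng.gauss(0.0, 1.0)                          # STEP 1
    z2 = rng.gauss(0.0, 1.0)
    return z1, z2 * math.sqrt(1.0 - rho * rho) + rho * z1   # STEP 2

rng = random.Random(5)
rho = 0.6
n = 100_000
xs, ys = zip(*(correlated_pair(rho, rng) for _ in range(n)))

mx = sum(xs) / n
my = sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
vx = sum((x - mx) ** 2 for x in xs) / n
vy = sum((y - my) ** 2 for y in ys) / n
corr = cov / math.sqrt(vx * vy)
print(corr, vy)    # close to 0.6 and 1
```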


In fact, there is a Cholesky decomposition for $N(m, \Sigma)$. Since $\Sigma$ is a
semi-positive definite matrix, i.e., $v^T\Sigma v \ge 0$ for all vectors $v$, there exists a lower
triangular matrix $L$ such that $\Sigma = LL^T$. The Cholesky decomposition is an
algorithm to obtain this lower triangular matrix $L$.

For $n \times n$ matrices $\Sigma = [a_{ij}]$ and $L = [l_{ij}]$, the Cholesky decomposition
algorithm works as follows.

STEP 1: Set $l_{11} = \sqrt{a_{11}}$.
STEP 2: For $j = 2, \ldots, n$, set $l_{j1} = a_{j1}/l_{11}$.
STEP 3: For $i = 2, \ldots, n-1$, conduct STEP 4 and STEP 5.
STEP 4: Set $l_{ii} = \left[a_{ii} - \sum_{k=1}^{i-1}l_{ik}^2\right]^{1/2}$.
STEP 5: For $j = i+1, \ldots, n$, set $l_{ji} = \frac{1}{l_{ii}}\left[a_{ji} - \sum_{k=1}^{i-1}l_{jk}l_{ik}\right]$.
STEP 6: Set $l_{nn} = \left[a_{nn} - \sum_{k=1}^{n-1}l_{nk}^2\right]^{1/2}$.
The algorithm can be implemented with VBA code. Note that the routine is
written for a correlation matrix, whose diagonal entries are all equal to one:

Function nCholesky(Correlation)
    'Creating the Cholesky Decomposition Factors
    Dim mRs As Single
    mRs = nCountRC(Correlation, True) 'Correlation.Rows.Count

    Dim aCholesky() As Double 'Cholesky Decomposition Matrix
    ReDim aCholesky(1 To mRs, 1 To mRs) As Double

    'First column: l(i,1) = a(i,1) because a(1,1) = 1
    For i = 1 To mRs
        aCholesky(i, 1) = Correlation(i, 1)
    Next i

    For i = 2 To mRs
        For j = 2 To i
            If i = j Then
                'Diagonal entry
                aCholesky2 = 0
                For k = 1 To j - 1
                    aCholesky2 = aCholesky2 + aCholesky(i, k) ^ 2
                Next k
                aCholesky(i, j) = Sqr(1 - aCholesky2)
            Else
                'Off-diagonal entry below the diagonal
                aCholeskyA = 0
                For k = 1 To j - 1
                    aCholeskyA = aCholeskyA + aCholesky(i, k) * aCholesky(j, k)
                Next k
                aCholesky(i, j) = (Correlation(i, j) - aCholeskyA) / aCholesky(j, j)
            End If
        Next j
    Next i

    nCholesky = aCholesky
End Function
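
For readers who prefer to experiment outside a spreadsheet, the same row-by-row computation can be sketched in Python. This is our own illustration, not a translation endorsed by the text: the function name `cholesky_lower` is hypothetical, and unlike the VBA routine it does not assume unit diagonal entries, although it does assume that the input matrix is symmetric and strictly positive definite so that no division by zero occurs. The test matrix is the correlation matrix used in Example 4.6.

```python
import math

def cholesky_lower(a):
    """Return the lower triangular L with L L^T = a, for a symmetric positive definite a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Sum of products of the entries already computed in rows i and j
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)      # diagonal entry
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]     # entry below the diagonal
    return L

sigma = [[1.0, -0.1, 0.2],
         [-0.1, 1.0, 0.1],
         [0.2, 0.1, 1.0]]
L = cholesky_lower(sigma)

# Rebuild L L^T to confirm that it reproduces sigma
recon = [[sum(L[i][k] * L[j][k] for k in range(3)) for j in range(3)] for i in range(3)]
print(L)
print(recon)
```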


In SPLUS, the Cholesky decomposition is computed with the function 'chol()'.
Given the matrix $L$, a random vector $X \sim N(m, \Sigma)$ is generated by

$$X = m + LZ, \quad Z \sim N(0, I). \tag{4.6}$$
The following SPLUS code generates an n-dimensional multivariate normal
random vector with mean $m$ and variance-covariance matrix $\Sigma$.

Z <- c(rep(0,n))
for (i in 1:n){
  Z[i] <- rnorm(1,0,1)
}
#t(chol(Sigma)) is the lower triangular factor L
X <- m + t(chol(Sigma))%*%Z

Theorem 4.5 The $X$ obtained in (4.6) follows $N(m, \Sigma)$.

Proof. The random vector $X$ has a Gaussian distribution as it is a linear
combination of Gaussian random variables. Therefore, it suffices to check the
mean and variance of $X$. For the mean,

$$E[X] = m + E[LZ] = m.$$
For the variance,
$$\mathrm{Var}[X] = \mathrm{Var}[LZ] = L\,(\mathrm{Var}[Z])\,L^T = LL^T = \Sigma.$$
□


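Example 4.6 Consider a portfolio of three assets: $P(t) = S_1(t) + 2S_2(t) + 3S_3(t)$. The current asset values are $S_1(0) = 100$, $S_2(0) = 60$ and $S_3(0) = 30$.
Suppose the rates of return of the three assets follow a multivariate normal distribution. Specifically, we let

$$R_i(t) = \frac{S_i(t + \Delta t) - S_i(t)}{S_i(t)} \quad \text{and} \quad R(t) = (R_1(t), R_2(t), R_3(t))^T,$$

where

$$R(t) = \begin{pmatrix} 0.1\Delta t + 0.2\sqrt{\Delta t}\,X_1 \\ -0.03\Delta t + 0.4\sqrt{\Delta t}\,X_2 \\ 0.2\Delta t + 0.25\sqrt{\Delta t}\,X_3 \end{pmatrix}, \quad
\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & -0.1 & 0.2 \\ -0.1 & 1 & 0.1 \\ 0.2 & 0.1 & 1 \end{pmatrix} \right).$$

Simulate 10 sample paths of the portfolio with $\Delta t = 1/100$.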


The SPLUS code for the problem is given as follows: 


n <- 10
N <- 100
dt <- 1/N

#input the variance-covariance matrix Sigma
Sigma <- matrix(c(1,-0.1,0.2,-0.1,1,0.1,0.2,0.1,1), nrow=3, ncol=3)
L <- t(chol(Sigma))
stock <- matrix(0,3,N+1)
stock[,1] <- c(100,60,30) #initial stock price
P <- matrix(0,n,N+1)

for(k in 1:n){
  for(i in 1:N){
    X <- L%*%rnorm(3,0,1)
    R <- c(0.1,-0.03,0.2)*dt + c(0.2,0.4,0.25)*sqrt(dt)*X
    stock[,i+1] <- stock[,i]*(1+R) #Generate asset prices paths
  }
  P[k,] <- stock[1,]+2*stock[2,]+3*stock[3,]
}

t <- c(0:N)/N

#plot 10 sample paths for the portfolio
plot(t,P[1,],type="l",ylim=c(240,420),pty=0,xlab="t",ylab="P(t)")
for(k in 2:n){lines(t,P[k,])}

#plot the last sample paths of individual assets and the portfolio
plot(t,P[n,],type="l",ylim=c(0,400),pty=0,xlab="t",ylab="prices")
for(k in 1:3){lines(t,stock[k,],lty=2)}


Two graphs are produced by the program. Fig. 4.1 plots 10 portfolio
sample paths against time. Fig. 4.2 plots one sample path for each individual
asset and one sample path of the portfolio. The assets and the portfolio can be
identified by their initial values.


4.5.3.2 Eigenvalue Decomposition The second method is the eigenvalue decomposition. Given an $n \times n$ matrix $\Sigma$, if a constant $\lambda$ and a non-zero vector $v$ satisfy

$$\Sigma v = \lambda v, \qquad (4.7)$$

then $\lambda$ is called an eigenvalue of the matrix $\Sigma$ and $v$ is the corresponding eigenvector. In principle, there are n eigenvalues for an $n \times n$ matrix. For the variance-covariance matrix $\Sigma$, we know that all eigenvalues are non-negative and the eigenvectors can be chosen to be orthogonal because $\Sigma$ is symmetric and positive semidefinite.

In multivariate analysis, eigenvalues of a variance-covariance matrix $\Sigma$ are arranged in descending order as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ and the corresponding eigenvectors are chosen to have unit length. This means $\|v_i\| = 1$ for $i = 1, 2, \ldots, n$. Under these specifications, $v_1$ is called the first principal component, $v_2$ is the second principal component and so on. More importantly, the matrix $\Sigma$ can be decomposed into a product of three square matrices:

$$\Sigma = PDP^T, \qquad (4.8)$$

where $P = [v_1, v_2, \ldots, v_n]$ and $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is a diagonal matrix.
In SPLUS, eigenvalues and eigenvectors are easily obtained with the subroutine
‘eigen()’.

Fig. 4.1 Sample paths of the portfolio.


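Theorem 4.6 If $Z \sim N(0, I)$, then $X = m + P\sqrt{D}Z \sim N(m, \Sigma)$.

Proof. Again, it suffices to check the mean and variance of $X$. For the mean,

$$E[X] = m + E[P\sqrt{D}Z] = m.$$

For the variance,

$$\mathrm{Var}[X] = \mathrm{Var}[P\sqrt{D}Z] = P\sqrt{D}\,(\mathrm{Var}[Z])\,[P\sqrt{D}]^T = PDP^T = \Sigma. \qquad \Box$$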
The following is the SPLUS code for generating Gaussian normal random 
vectors by the eigenvalue decomposition. 


Z <- rep(0,n)
for (i in 1:n){
  Z[i] <- rnorm(1,0,1)
}
A <- (eigen(Sigma)$vectors)%*%diag(sqrt(eigen(Sigma)$values))
X <- m + A%*%Z


Fig. 4.2 Sample paths of the assets and the portfolio. 


Remarks: VBA users may worry about the matrix operations used in the above
algorithms. Fortunately, there are free downloads available on the Web that
provide the necessary subroutines under the platform of EXCEL. For instance,
PopTools, from http://www.cse.csiro.au/poptools/index.htm, includes routines
for Cholesky and eigenvalue decompositions.


4.6 EXERCISES 


1. Use the inverse transform method to generate a random variable X
with the probability mass function


(b) $P(X = j) = \binom{n+j-1}{j}(1 - p)^j p^n$, $j = 0, 1, 2, \ldots$, where n and p
are given parameters.


2. We simulate X,Y, Z from an inverse transform algorithm. Suppose U ~ 
U(0,1). Determine the distributions of the following random variables: 
(a) X = int(loU(1 — U)) 
(b) Y = int(1/U) 
(c) $Z = (B - 3)^2$, $B \sim \mathrm{Bin}(5, 0.5)$.

3. Determine the p.d.f. of

(a) $X = -10 \log U + 5$

(b) $X = 2\tan(\pi U) + 10$

(c) $W = nU - \mathrm{int}(nU)$. Show that it is independent of $I = \mathrm{int}(nU)$.
(Hint: Show that $P(W \le w, I = i) = w/n$.)


4. Let X have probability mass function


i           1     2     3     4     5     6     7
P(X = i)    0.3   0.12  0.09  0.12  0.1   0.17  0.1


(a) Use the acceptance-rejection algorithm, simulate 1,000 data points 
from this distribution. You may use a discrete uniform as your g. 


(b) Plot out the histogram of your simulation. 


(c) What is the expected number of acceptance for this distribution? 
Does that match your simulation results? 


5. Suppose we want to simulate from the density
$f(x) = x + 1/2$, $0 \le x \le 1$.


(a) Using the inverse transformation method, simulate 1,000 values 
from f. 

(b) Using the acceptance-rejection method, simulate another 1,000 val- 
ues from f. Which algorithm is more efficient? 


6. Suppose we want to simulate $|Z|$, where $Z \sim N(0,1)$. That is, the
absolute value of a standard normal random variable. First note that
the p.d.f. of $|Z|$ is given by

$$f(x) = \frac{2}{\sqrt{2\pi}} e^{-x^2/2}, \quad 0 < x < \infty.$$

Suppose you want to use the acceptance-rejection algorithm to simulate
$|Z|$. Take g to be the exponential distribution,

$$g(x) = e^{-x}, \quad 0 < x < \infty.$$

(a) Determine the value c such that $c = \max_x f(x)/g(x)$.


(b) Use the acceptance-rejection method to simulate 1,000 values of $|Z|$.


(c) Suppose you want to recover Z from the simulated values of $|Z|$.
One way to do it is to generate a random number U and set

$$Z = \begin{cases} |Z| & \text{if } U > 1/2, \\ -|Z| & \text{if } U \le 1/2. \end{cases}$$

Using this method, obtain 1,000 values of Z and plot its density.
Standard Simulations in
Risk Management 


5.1 INTRODUCTION 


Risk management applications require simulation experiments. In this chap- 
ter, we introduce some standard simulation techniques and discuss their ap- 
plications in risk management. 


5.2 SCENARIO ANALYSIS 


Scenario analysis of risk management refers to simulating possible scenarios 
to analyze the risk of a decision and consequences. The ultimate goal of a 
scenario analysis may be to reach a decision, to verify a model or to validate 
a certain conjecture. 

Suppose a newspaper boy buys a newspaper from an agent for $4 each and 
sells it for $6. His problem is to decide how many newspapers to buy each 
morning. In other words, what would be a prudent purchasing strategy? 

To analyze the situation, he examines the sales record for the past 100 days 
given in Table 5.1. After reviewing the data in Table 5.1, he comes up with 
the following strategies: 


1. Each day, purchase the same number of papers sold the day before. 


2. Each day, purchase a fixed number of papers, say 23. 
Number of newspapers Days occurring


21 15 
22 20 
23 30 
24 21 
25 14 


Table 5.1 Sales record. 


To test each of these two strategies, one could simulate the scenarios using 
inverse transform. First convert the information in Table 5.1 into the empir- 
ical probability mass function (p.m.f.): 


Number of newspapers p.m.f. Cumulative distribution 


21 0.15 0.15 
22 0.20 0.35 
23 0.30 0.65 
24 0.21 0.86 
25 0.14 1.00


Table 5.2 Probability mass function. 


Now simulate 10 future days and compare the two policies following the
p.m.f. given in Table 5.2. For each day, the simulation draws a standard uniform random
variable u. The interval [0,1] is partitioned according to the cumulative
distribution in Table 5.2, and the demand for newspapers is generated according to
the subinterval in which u falls. For instance, if u = 0.17, which belongs to the range of
0.15 to 0.35, then the corresponding demand is 22. To have a fair comparison,
assume that the newspaper boy orders 23 papers on Day 1. Table 5.3 lists the
results of the simulation. According to Table 5.3, policy 2 is better
than policy 1. One can repeat the simulation many times to see if this
phenomenon is consistent.

          u ~ U(0,1)   Number of newspapers   Profit of 1   Profit of 2

Day 1     0.5828       23                     $46           $46
Day 2     0.0235       21                     $34           $34
Day 3     0.5155       23                     $42           $46
Day 4     0.3340       22                     $40           $40
Day 5     0.4329       23                     $44           $46
Day 6     0.2259       22                     $40           $40
Day 7     0.5798       23                     $44           $46
Day 8     0.7604       24                     $46           $46
Day 9     0.8298       24                     $48           $46
Day 10    0.6405       23                     $42           $46

Total Profit                                  $426          $436


Table 5.3 Policy simulation and evaluation.


The newspaper boy problem illustrates several important elements in scenario
analysis. Decision makers identify possible scenarios based on empirical
data or experience. In this example, scenarios correspond to the daily demand
of newspapers. Simulation is then developed to replicate future possibilities.
We use the inverse transform with the empirical distribution in this example.
After generating scenarios, a risk manager analyzes the consequences
corresponding to each scenario. If the first policy is adopted, then the number of
newspapers purchased equals the number sold yesterday; otherwise, 23 papers
are purchased. Finally, evaluation and comparison can be conducted using the
simulated results.
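The comparison can be retraced with a short Python sketch. It reuses the ten uniform draws listed in Table 5.3 rather than generating new ones, so the totals can be checked against the table. Two points are assumptions read off the table rather than stated in the text: unsold papers have no salvage value, and policy 1 orders the previous day's demand. The names `demand_from_uniform` and `profit` are illustrative.

```python
# Uniform draws copied from Table 5.3; prices from the text.
U = [0.5828, 0.0235, 0.5155, 0.3340, 0.4329,
     0.2259, 0.5798, 0.7604, 0.8298, 0.6405]
DEMAND = [21, 22, 23, 24, 25]
CDF = [0.15, 0.35, 0.65, 0.86, 1.00]   # cumulative distribution, Table 5.2
COST, PRICE = 4, 6


def demand_from_uniform(u):
    """Inverse transform: first demand level whose cumulative probability covers u."""
    for d, c in zip(DEMAND, CDF):
        if u <= c:
            return d
    return DEMAND[-1]


def profit(order, demand):
    """Revenue on papers actually sold minus cost of papers ordered
    (assumption: unsold papers are worthless)."""
    return PRICE * min(order, demand) - COST * order


demands = [demand_from_uniform(u) for u in U]

# Policy 1: order yesterday's demand (23 on Day 1). Policy 2: always order 23.
orders_1 = [23] + demands[:-1]
orders_2 = [23] * len(demands)

total_1 = sum(profit(o, d) for o, d in zip(orders_1, demands))
total_2 = sum(profit(o, d) for o, d in zip(orders_2, demands))
print(total_1, total_2)
```

Replacing `U` with fresh draws from `random.random()` and repeating the run many times gives the repeated comparison suggested in the text.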


5.2.1 Value at Risk 


In finance, risk scenario analysis is usually conducted for evaluating value-at-risk
(VaR), a widely adopted risk measure.


Definition 5.1 VaR summarizes the worst loss of a portfolio over a target 
horizon with a given level of confidence. 


Statistically speaking, VaR describes the specified quantile or percentile of the
projected distribution of profits and losses over the target horizon. Let $R_t$ be
the return of a portfolio for a horizon t. Then, the c% confidence VaR of the
portfolio is measured through the expression:

$$P(R_t < -\mathrm{VaR}) = (1 - c)\% := \alpha. \qquad (5.1)$$

Hence, VaR is the negative of the $\alpha$-quantile of the probability distribution
of profits and losses. The larger the VaR, the higher the risk of the
portfolio. An advantage of VaR is that it allows the user to specify the
confidence level to reflect individual risk aversion. For more details, see Jorion
(2000).

VaR is indispensable for market risk analysis because it is the number that
splits future possible asset returns into two scenarios: risky and nonrisky.
Returns less than the negative of VaR belong to the class of risky scenarios.
Decision makers can evaluate their policies by examining consequences under
the risky scenario. For instance, a bank may check if it maintains enough
money for an extremely risky situation.

A conventional way to measure VaR assumes that portfolio returns
follow a normal distribution. VaR obtained in this way is called normal VaR.
A typical model is

$$R_t = \mu + \sigma Z, \quad Z \sim N(0,1). \qquad (5.2)$$

In such a parametric model, it is easy to derive that

$$\mathrm{VaR}_\alpha(t) = -z_\alpha \sigma - \mu, \qquad (5.3)$$

where $z_\alpha$ is the $\alpha$-quantile of the standard normal distribution, $\mu$ is the drift
and $\sigma$ is the standard deviation of the return $R_t$ over the horizon t.

Though one can prove (5.3) mathematically, we would like to verify it by 
simulation. The algorithm is given as follows. 


STEP 1: Generate n independent standard normal random variables,
namely $Z_i \sim N(0,1)$ i.i.d., $i = 1, 2, \ldots, n$.
STEP 2: Set $R_i = \mu + \sigma Z_i$.
STEP 3: Rank $\{R_1, R_2, \ldots, R_n\}$ in ascending order as $\{R_1^*, R_2^*, \ldots, R_n^*\}$.
STEP 4: Set $\mathrm{VaR} = -R_k^*$, where $k = \mathrm{int}(\alpha \times n)$.


Example 5.1 Let $\mu = 0.003$, $\sigma = 0.23$, $\alpha = 5\%$, and n = 10,000. Then,
the 95% VaR corresponds to the 500th smallest return generated from the
simulation. Our simulation shows that the VaR = 0.3783, which is close to
the value, 0.3753, obtained by (5.3). The SPLUS code is as follows:


n <- 10000 

alp <- 0.05 

k <- round(n*alp) 
mu <- 0.003 


sigma <- 0.23 
R <- rnorm(n,mu, sigma) 
SR <- sort(R) 
VaR <- -SR[k] 
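

For comparison, the same check can be sketched in Python with the standard library alone. The function names are illustrative, the seed is arbitrary, and `statistics.NormalDist` supplies the normal quantile needed for (5.3); the simulated figure will differ from the 0.3783 quoted above because the random draws differ.

```python
import random
from statistics import NormalDist


def simulated_var(mu, sigma, alpha, n, rng):
    """STEP 1-4: VaR as the negative of the k-th smallest simulated return."""
    returns = sorted(rng.gauss(mu, sigma) for _ in range(n))
    k = int(alpha * n)
    return -returns[k - 1]      # k-th smallest; Python lists are 0-based


def normal_var(mu, sigma, alpha):
    """Closed form (5.3): VaR = -z_alpha * sigma - mu."""
    return -NormalDist().inv_cdf(alpha) * sigma - mu


rng = random.Random(2024)       # arbitrary seed for reproducibility
var_sim = simulated_var(0.003, 0.23, 0.05, 10000, rng)
var_exact = normal_var(0.003, 0.23, 0.05)
print(var_sim, var_exact)
```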


5.2.2 Heavy-Tailed Distribution 


In reality, returns of market prices may not follow a normal distribution but
a heavy-tailed distribution. This means the two tails of the empirical density
decay less rapidly than the normal density. Since a closed-form solution for the
VaR of a heavy-tailed distribution is not readily available, a feasible alternative
is to generate random variables according to a heavy-tailed distribution.

One commonly used form for heavy-tailed distribution is the generalized
error distribution (GED). The p.d.f. of GED with parameter $\xi$ is given by

$$f(x) = \frac{\xi \exp\left(-\frac{1}{2}|x/\lambda|^{\xi}\right)}{\lambda\, 2^{(1+1/\xi)}\, \Gamma(1/\xi)}, \qquad (5.4)$$

$$\lambda = \left[\frac{2^{-2/\xi}\, \Gamma(1/\xi)}{\Gamma(3/\xi)}\right]^{1/2},$$

where $\Gamma(\cdot)$ denotes the Gamma function. Fig. 5.1 plots the p.d.f. of GED and
Fig. 5.2 zooms in at the left tail of the density function. It is seen that the
smaller $\xi$ is, the heavier the left tail of the density function is.


[Figure: f(z) plotted against z for z from -4 to 4, with the standard normal density marked.]

Fig. 5.1 The shape of GED density function.


The key to simulating VaR is to generate random variables following the desired
distribution. Here, we apply the rejection method introduced in Chapter
4 using an exponential distribution for g. The algorithm goes as follows.


STEP 1: Generate $Y \sim \mathrm{Exp}(1)$.
STEP 2: Generate $U \sim U(0,1)$.
STEP 3: If $U \le 2f(Y)e^{Y}/a$, then $Z = Y$; else go to STEP 1.
STEP 4: Generate $V \sim U(0,1)$. If $V < 1/2$, then $Z = -Y$.
STEP 5: Repeat STEP 1 - 4 n times to get $\{Z_1, Z_2, \ldots, Z_n\}$.
STEP 6: Set $R_i = \mu + \sigma Z_i$.
STEP 7: Sort the returns in ascending order as $\{R_1^*, R_2^*, \ldots, R_n^*\}$.
STEP 8: Set $\mathrm{VaR} = -R_k^*$, where $k = \mathrm{int}(\alpha \times n)$.


[Figure: left tail of the GED densities for z between -4 and -2, with the standard normal density marked.]

Fig. 5.2 Left tail of GED.


Remarks:
1. In Step 3, a is a constant no less than $\max_y\{2f(y)e^{y}\}$.


2. As the exponential distribution is defined with a domain of positive real
numbers, Steps 1 to 3 of the algorithm generate positive GED. Step 4
converts a positive GED random variable into a GED random variable.
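

Steps 1 to 4 can be sketched in Python as follows. This is only an illustration under stated assumptions: the shape parameter 1.21 and the bound a = 1.2 are the values the text uses for its Dow Jones case study, the function names are hypothetical, and the seed is arbitrary.

```python
import math
import random


def ged_pdf(x, xi):
    """GED density (5.4) with shape parameter xi."""
    lam = math.sqrt(2.0 ** (-2.0 / xi) * math.gamma(1.0 / xi) / math.gamma(3.0 / xi))
    return (xi * math.exp(-0.5 * abs(x / lam) ** xi)
            / (lam * 2.0 ** (1.0 + 1.0 / xi) * math.gamma(1.0 / xi)))


def ged_sample(xi, a, rng):
    """One GED draw by rejection from Exp(1), followed by a random sign."""
    while True:
        y = rng.expovariate(1.0)                 # STEP 1
        u = rng.random()                         # STEP 2
        if u <= 2.0 * ged_pdf(y, xi) * math.exp(y) / a:   # STEP 3
            break
    v = rng.random()                             # STEP 4
    return -y if v < 0.5 else y


rng = random.Random(7)                           # arbitrary seed
draws = [ged_sample(1.21, 1.2, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)
print(sample_mean, sample_var)
```

With the scale $\lambda$ of (5.4) the GED has mean 0 and variance 1, so the printed sample moments give a rough sanity check on the sampler.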


5.2.3. Case Study: VaR of Dow Jones 


We demonstrate the use of GED-VaR by considering 10-year daily closing
prices of the Dow Jones Industrial Index (DJI) in the period of August 8, 1995,
to August 7, 2004. Data downloaded from http://finance.yahoo.com consist
of 2,265 prices. The prices are converted into 2,264 daily returns by the
formula:

$$R_t = \frac{S_t - S_{t-1}}{S_{t-1}}.$$

The sample mean and standard deviation of the daily returns are 0.04% and 1.16%,
respectively. From (5.3), the 95% and 99% normal VaR from the
sample are 1.87% and 2.66%, respectively.


[Figure: normal quantiles plotted against DJ returns quantiles.]

Fig. 5.3 QQ plot of normal quantiles against daily Dow Jones returns.


To assess the quality of normal VaR, one has to test the normality assumption
or, more precisely, the distributional assumption used in the VaR
computation. Here, we introduce a simple but valuable tool, known as the 
quantile-quantile (QQ) plot. The idea is to plot the quantiles of the sample 
returns against the quantiles of the distribution used. If the returns truly 
follow the target distribution, then the graph should look like a straight line. 
For testing normality, the target distribution is the normal distribution. Sys- 
tematic deviations from the line signal that the returns are not well described 
by the normal distribution. 

Fig. 5.3 shows a QQ plot of our sample against the normal distribution.
Large deviations are observed in the two tails of the empirical data. Specifically,
the empirical quantile is less than the normal quantile in the left tail
but larger than the normal quantile in the right tail. The deviations strongly
suggest that the empirical data follow a heavy-tailed distribution.

We use GED to reduce the deviation from the QQ plot. Returns are first
standardized by the sample mean and standard deviation as

$$SR_t = \frac{R_t - 0.04\%}{1.16\%},$$

where $SR_t$ denotes the standardized return at time t. We conjecture that
$SR_t \sim \mathrm{GED}(\xi)$, identically and independently. The parameter $\xi$ is estimated
from the $SR_t$ using maximum likelihood estimation. Our estimation shows
that $\xi = 1.21$ (see Appendix). Then, GED-VaR is estimated from the eight-step
algorithm in Section 5.2.2, where the constant a is required in Step 3.
The value of a can be deduced from the plot of $2f(y)e^{y}$ against y, where $f(y)$
is the p.d.f. of GED(1.21). Fig. 5.4 shows that the maximum function value
is bounded above by 1.2 so that we set a = 1.2.


Fig. 5.4 Determine the maximum of $2f(y)e^{y}$ graphically.


We write the SPLUS code for simulating GED-VaR. 


########## positive GED pdf(v) ##########

funGED<-function(x,v){
  lamda<-((2^(-2/v)*gamma(1/v))/gamma(3/v))^0.5
  positiveGED<-2*(v*exp(-0.5*(x/lamda)^v))/(gamma(1/v)*lamda*2^(1+1/v))
}

funEXP<-function(x){
  EXP<-exp(-x)
}

N<-10000
VaR95<-c(rep(0,1000))
VaR99<-c(rep(0,1000))
for (j in 1:1000){
  #A program for generating GED by
  #using rejection method (Proposal Density is exp(1))
  # First generate positive GED distribution
  N<-10000 #number of random numbers we want to generate
  e<-c(rep(0,N))
  for (i in 1:N){
    u2<- 1 #initializing u2 for STEP 2
    Y <- 2
    while (u2>funGED(Y,1.21)/(funEXP(Y)*1.2)){ #Check conditions for STEP 3
      u1<-runif(1) #STEP 1
      Y<- -log(u1) #STEP 1: Generate the exp(1)
      u2<-runif(1) #STEP 2
      e[i]<-Y #STEP 3
    }
  }

  #Then generate GED by assigning half of the
  #positive GED with negative sign
  for (k in 1:N){
    u3<-runif(1)
    if (u3<0.5){
      e[k]<- -e[k] #STEP 4
    }
  }

  R<-(0.04+1.16*e)/100 #STEP 6
  sR<-sort(R) #STEP 7
  VaR95[j]<- -sR[N*0.05] #STEP 8 for 95% VaR
  VaR99[j]<- -sR[N*0.01] #STEP 8 for 99% VaR
}

###The whole algorithm is repeated 1,000 times to get the C.I.###

sVaR95<-sort(VaR95)
sVaR99<-sort(VaR99)
E95VaR<-mean(VaR95)
UCIVaR95<-sVaR95[1000*0.975] #upper limit of the 95% CI for the 95% VaR
LCIVaR95<-sVaR95[1000*0.025] #lower limit of the 95% CI for the 95% VaR
E99VaR<-mean(VaR99)
UCIVaR99<-sVaR99[1000*0.975] #upper limit of the 95% CI for the 99% VaR
LCIVaR99<-sVaR99[1000*0.025] #lower limit of the 95% CI for the 99% VaR
E95VaR
UCIVaR95
LCIVaR95
E99VaR
UCIVaR99
LCIVaR99


[Figure: simulated GED(1.21) quantiles plotted against DJ returns quantiles.]

Fig. 5.5 QQ plot GED(1.21) quantiles against Dow Jones return quantiles.


The program estimates 95% and 99% VaR by generating 10,000 GED(1.21) 
random variables. For the confidence intervals, it repeats the process 1,000 
times to get 1,000 VaR estimates. After arranging the simulated VaRs in 
ascending order, the 95% two-tailed confidence interval (CI) is the range be- 
tween the 25th VaR and the 975th VaR. 

To check the performance of GED-VaR, we use the QQ plot of Fig. 5.5
based on one simulation. It is seen that deviations from the straight line
have been substantially reduced. From this exercise, we see that GED(1.21)
is appropriate for modeling the sample of Dow Jones returns. The average
95% VaR and 99% VaR from the 1,000 simulations are 1.87% and 3.02%,
respectively. Therefore, 95% GED-VaR and 95% normal VaR give similar
values whereas 99% GED-VaR is more than 10% higher than the 99% normal VaR.

These findings may be useful for a risk manager. As normal VaR is commonly
used in the financial industry, it is essential for a risk manager to
understand the limitation of the normal VaR. The message of this empirical
study is that normal VaR is a good estimate for potential losses of a portfolio
under “normal, nonextreme” scenarios. However, it underestimates potential
losses when “extreme events” happen, especially for those happening with
probability less than 1%. To measure VaR with a higher confidence level, e.g.,
99% VaR, the risk manager may consider GED-VaR. For further discussion
about extreme values, see Embrechts, Klüppelberg, and Mikosch (1997) and
the themed volume of Finkenstädt and Rootzén (2004).


5.3. STANDARD MONTE CARLO 


In the preceding chapters, we studied the idea of simulating random variables.
One of the main reasons to simulate random variables is to estimate quantities
like E(X), which is related to the evaluation of definite integrals. Suppose
we have already generated n values of a random variable X. It would be
very natural to estimate the quantity $\theta = E(X)$ by $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. We
shall study some standard statistical techniques to assess the accuracy of
such an estimate, which are based on the law of large numbers and the central
limit theorem. Whenever we estimate quantities like E(X) based on standard
applications of simulations, we refer to these methods as standard Monte Carlo
simulations. We shall study other more sophisticated simulation methods in
later chapters.


5.3.1 Mean, Variance, and Interval Estimation 


Suppose that X is a given random variable with mean $\theta$ and variance $\sigma^2$. A
natural way to evaluate $\theta = E(X)$ using simulations is to generate random
values $X_1, \ldots, X_n$ and calculate the quantity

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,$$

which is called the sample mean of $\{X_1, \ldots, X_n\}$. It is easy to see that

$$E(\bar{X}_n) = E(X) = \theta, \quad \text{unbiasedness property}, \qquad (5.5)$$

$$\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}. \qquad (5.6)$$


To assess the accuracy of $\bar{X}_n$ as an estimate of $\theta$, we rely on two important
results. The first one is the law of large numbers, which asserts that as the
number of simulations n gets bigger, $\bar{X}_n$ gets closer to $\theta$; see, for example,
Casella and Berger (2001). Specifically,


Theorem 5.1 Let $X_1, \ldots, X_n$ be i.i.d. random variables with mean $\theta$ and
variance $\sigma^2$. Then for any given $\epsilon > 0$,

$$P(|\bar{X}_n - \theta| > \epsilon) \to 0 \quad \text{as } n \to \infty.$$

This result is sometimes written as $\bar{X}_n \to \theta$ in probability.
The second one is the central limit theorem, which asserts that as n tends
to infinity, the distribution of the standardized sample mean is approximately
normal.


Theorem 5.2 Let $X_1, \ldots, X_n$ be i.i.d. random variables with mean $\theta$ and
variance $\sigma^2 > 0$. Then as n tends to infinity

$$P\left(\frac{\sqrt{n}(\bar{X}_n - \theta)}{\sigma} \le z\right) \to \Phi(z),$$

where $\Phi(z)$ denotes the c.d.f. of a standard normal distribution evaluated at
the point z.


An equivalent statement of this result is that the random variable $\sqrt{n}(\bar{X}_n - \theta)/\sigma$
converges in distribution to Z, written as

$$\frac{\sqrt{n}(\bar{X}_n - \theta)}{\sigma} \xrightarrow{d} Z,$$

where $Z \sim N(0,1)$. The proofs of these two results can be found in standard
textbooks in probability; see Billingsley (2001) for example. One immediate
application of the central limit theorem is to construct approximate confidence
intervals for $\theta$. According to Theorem 5.2,

$$P\left(|\bar{X}_n - \theta| > \frac{c\sigma}{\sqrt{n}}\right) \approx P(|Z| > c) = 2(1 - \Phi(c)).$$
As a result, if we let c = 1.96, then the probability that $\bar{X}_n$ differs from $\theta$ by more
than $1.96\sigma/\sqrt{n}$ is approximately equal to 0.05. In other words, we are
relatively confident (95%) that our estimate is within two standard errors
($1.96\sigma/\sqrt{n}$) of $\theta$. To make use of this result, we need to know
the value of $\sigma$, which is usually unavailable. A simple fix is to estimate it
from the simulated values. The sample variance, which is defined as

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X}_n)^2,$$

constitutes an estimate of $\sigma^2$. Writing $S_j^2$ and $\bar{X}_j$ for the sample variance and sample mean of the first j values, it can be easily shown that

$$E(S^2) = \sigma^2, \quad \text{unbiasedness property}, \qquad (5.7)$$

$$S_{j+1}^2 = (1 - 1/j)\,S_j^2 + (j+1)(\bar{X}_{j+1} - \bar{X}_j)^2. \qquad (5.8)$$


One frequently asked question in simulations is: after simulating X and
evaluating $\bar{X}_n$, when should we stop? The answer to this question is given by
the following scheme:


1. Choose an appropriate value d for the standard deviation of the
estimator. That is, d represents the margin of error we can tolerate using
simulations.

2. Generate at least 100 values of X.

3. Continue generating X and stop when we have k values of X such
that $S/\sqrt{k} < d$.

4. The desired estimate is given by $\bar{X}_k$.
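

The stopping scheme can be combined with a running update of the sample mean and the recursion (5.8) for the sample variance, so that no simulated value needs to be stored. The Python sketch below is illustrative: the function name `sequential_mean`, the cap `max_n`, and the seed are assumptions made here, and the running-mean update $\bar{X}_{k+1} = \bar{X}_k + (X_{k+1} - \bar{X}_k)/(k+1)$ is the standard identity.

```python
import math
import random


def sequential_mean(draw, d, min_n=100, max_n=1_000_000):
    """Simulate until S / sqrt(k) < d, with at least min_n draws.

    draw  : function with no arguments returning one simulated value of X
    d     : tolerated standard deviation of the estimator
    max_n : safety cap on the number of draws (an added assumption)
    """
    mean, var, k = draw(), 0.0, 1
    while k < max_n:
        x = draw()
        new_mean = mean + (x - mean) / (k + 1)
        # Recursion (5.8) with j = k; the first term vanishes when k = 1
        var = (1.0 - 1.0 / k) * var + (k + 1) * (new_mean - mean) ** 2
        mean, k = new_mean, k + 1
        if k >= min_n and math.sqrt(var / k) < d:
            break
    return mean, var, k


rng = random.Random(1)          # arbitrary seed
mean_hat, var_hat, k_used = sequential_mean(lambda: rng.gauss(0.0, 1.0), d=0.05)
print(mean_hat, var_hat, k_used)
```

For standard normal draws and d = 0.05 the rule stops after roughly $(1/0.05)^2 = 400$ values, since S is close to 1.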


Finally, we can form an interval estimate for $\theta$ by using the notion of
confidence intervals.


Definition 5.2 If $\bar{X}_n = \bar{x}$, $S = s$, then the interval

$$\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$

is an approximate $100(1 - \alpha)\%$ confidence interval for $\theta$.


In particular, when $\alpha = 0.05$, $z_{\alpha/2} = 1.96$ and $(\bar{x} \pm 1.96 s/\sqrt{n})$ is an approximate
95% confidence interval for $\theta$, thus giving rise to the rule of “two
sigma.”


5.3.2 Simulating Option Prices 


To illustrate the ideas of standard simulations in risk management, consider
first simulating stock prices. Let S denote the price of a stock. Recall that
we usually assume that S follows a geometric Brownian motion

$$dS = \mu S\,dt + \sigma S\,dW.$$

Equivalently,

$$d\log S = \nu\,dt + \sigma\,dW,$$

where $\nu = \mu - \sigma^2/2$. Using the last equation and letting $\epsilon$ denote a standard
normal random variable, we can generate S according to the formula

$$S(t + dt) = S(t)\exp(\nu\,dt + \sigma\epsilon\sqrt{dt}).$$

In particular,

$$S(T) = S(0)\exp(\nu T + \sigma\epsilon\sqrt{T}). \qquad (5.9)$$

Notice that according to the risk neutral valuation principle, we usually take
$\mu = r$, the risk free rate.


Example 5.2 Let $S_0 = 10$, $\mu = r = 0.03$, $\sigma = 0.4$, and $dt = 1/52$. We want
to simulate weekly prices of the stock $S_i$, $i = 1, \ldots, 52$, for a one-year period.
Then $\nu = \mu - \sigma^2/2 = -0.05$ and the results are given in Table 5.4. The SPLUS
code is as follows:
N <- 52
S0 <- 10
mu <- 0.03
sigma <- 0.4
nu <- mu - sigma^2/2
t <- (0:N)/N
dt <- 1/N
x <- rep(0,N+1)
y <- rep(0,N+1)
z <- rnorm(N,0,1)

y[1] <- S0
for (i in 1:N)
{
  x[i+1] <- sqrt(dt)*sum(z[1:i])
  y[i+1] <- y[1]*exp(nu*t[i+1]+sigma*x[i+1])
}
plot(t,y,type="l",xlab="t",ylab="Stock Price")


Suppose we want to calculate the price of a European call option maturing
in one year with strike price K = 12. We can use the Black-Scholes formula
to obtain the call price C as

$$C(S,t) = S\Phi(d_1) - Ke^{-r(T-t)}\Phi(d_2),$$

where $d_1 = \frac{1}{\sigma\sqrt{T-t}}\left(\log(S/K) + (r + \sigma^2/2)(T - t)\right)$ and $d_2 = d_1 - \sigma\sqrt{T-t}$.
Substituting the values of $r = 0.03$, $K = 12$, $T = 1$, $t = 0$, $\sigma = 0.4$, and
$S_0 = 10$, we get

$$d_1 = \frac{1}{0.4}\left(\log(10/12) + (0.03 + 0.08)(1)\right) = -0.1808, \quad d_2 = d_1 - 0.4 = -0.5808.$$

Using the SPLUS command pnorm(z) to evaluate $\Phi(z)$, we get $\Phi(d_1) = 0.4283$
and $\Phi(d_2) = 0.2807$. Hence,

$$C = 10(0.4283) - 12e^{-0.03}(0.2807) = 1.013918.$$

On the other hand, we can evaluate $C = e^{-rT}E(S_T - K)^+$.
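

The closed-form value can be reproduced with a few lines of Python; this sketch uses `statistics.NormalDist().cdf` in place of pnorm, and the function name `black_scholes_call` is illustrative.

```python
import math
from statistics import NormalDist


def black_scholes_call(s, k, r, sigma, tau):
    """Black-Scholes price of a European call with time to maturity tau = T - t."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    phi = NormalDist().cdf
    price = s * phi(d1) - k * math.exp(-r * tau) * phi(d2)
    return price, d1, d2


price, d1, d2 = black_scholes_call(10.0, 12.0, 0.03, 0.4, 1.0)
print(round(d1, 4), round(d2, 4), round(price, 4))
```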


Week   Price

0      10.000000
1      10.38419
2      10.37402
3      10.67406
4      11.65342
5      11.89871
6      11.28296
7      11.15327
8      10.33483
9      11.16090
10     12.14546
...
43     14.39009
44     13.78038
45     14.01125
46     12.72393
47     13.44627
48     13.05377
49     12.00424
50     12.74416
51     12.16204
52     12.15517


Table 5.4 Simulated prices of the first and the last 10 weeks.


Example 5.3 The price of the European call option can now be computed
using simulations.


1. First generate n independent values of $S_1(T), \ldots, S_n(T)$ according to
(5.9).

2. Compute simulated discounted call prices $C_i = e^{-rT}\max\{S_i(T) - K, 0\}$, $i = 1, \ldots, n$.

3. Compute $\bar{C} = \frac{1}{n}\sum_{i=1}^{n} C_i$. $\bar{C}$ is an estimate of the discounted expected payoff
$e^{-rT}E(S_T - K)^+$.

4. Construct a 95% confidence interval for C from

$$\bar{C} \pm 1.96\,S/\sqrt{n},$$

where

$$S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(C_i - \bar{C})^2}$$

is the sample standard deviation of the simulated call prices $C_i$.


To simulate 100 paths, the SPLUS code is as follows:


p <- 100
N <- 52
S0 <- 10
K <- 12
mu <- 0.03
sigma <- 0.4
nu <- mu - sigma^2/2
t <- (0:N)/N
dt <- 1/N
x <- matrix(rep(0,(N+1)*p),nrow=(N+1))
y <- matrix(rep(0,(N+1)*p),nrow=(N+1))
for (j in 1:p)
{
  z <- rnorm(N,0,1)
  for (i in 1:N)
  {
    x[i+1,j] <- sqrt(dt)*sum(z[1:i])
    y[i+1,j] <- S0*exp(nu*t[i+1]+sigma*x[i+1,j])
  }
}
ST <- y[N+1,]
C <- rep(0,p)
for (i in 1:p)
{
  C[i] <- exp(-mu)*max(ST[i]-K,0)
}

C.bar <- mean(C)
S <- sqrt(var(C)*p/(p-1))
CI <- mean(C) - 1.96*S/sqrt(p)
CI[2] <- mean(C) + 1.96*S/sqrt(p)


Outputs of the simulated $C_i$'s are given in Table 5.5. The result of a 100-path
simulation shows that the 95% confidence interval for C is (0.37, 1.83).
Fig. 5.6 shows that when the number of runs increases, the value of $\bar{C}$ converges
to the limit of 1.01.
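

Since the payoff of a European call depends only on the terminal price, the four steps of Example 5.3 can also be sketched by drawing $S(T)$ directly from (5.9) without building whole paths. The Python version below is an illustration under that simplification; the function name, the seed, and the sample size of 200,000 are choices made here, not taken from the text.

```python
import math
import random


def mc_call(s0, k, r, sigma, T, n, rng):
    """Monte Carlo estimate of a European call with a 95% confidence interval."""
    nu = r - 0.5 * sigma ** 2
    disc = math.exp(-r * T)
    payoffs = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        s_T = s0 * math.exp(nu * T + sigma * math.sqrt(T) * z)   # formula (5.9)
        payoffs.append(disc * max(s_T - k, 0.0))
    c_bar = sum(payoffs) / n
    s = math.sqrt(sum((c - c_bar) ** 2 for c in payoffs) / (n - 1))
    half = 1.96 * s / math.sqrt(n)
    return c_bar, (c_bar - half, c_bar + half)


rng = random.Random(42)          # arbitrary seed
c_bar, ci = mc_call(10.0, 12.0, 0.03, 0.4, 1.0, 200000, rng)
print(c_bar, ci)
```

The interval shrinks at the rate $1/\sqrt{n}$, which is why the 100-path interval reported above is so much wider than what a run of this size produces.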


5.3.3 Simulating Option Delta 


In risk management, hedging an option is sometimes more important than
valuing the option. When a bank issues structured financial products to enhance
sales, the embedded option risk would be of great concern. Hedging is
a useful device to manage such a risk. For a standard call option, the hedge
ratio refers to the delta of the option, the partial derivative of the option
price with respect to the underlying asset price. Under the Black-Scholes
assumption, the delta of a call is given by

$$\text{delta} = \frac{\partial C}{\partial S} = \Phi(d_1). \qquad (5.10)$$

We use simulation to calculate the hedge ratio, delta, for general European
options.
Path   $C_i$

1      0.0000000
2      0.0000000
3      0.0000000
4      5.9331955
5      1.1971242
6      0.0000000
7      2.2395878
8      0.0000000
9      0.0000000
10     4.0065595
11     0.0000000
12     1.3006804
13     0.0000000
14     0.0000000
15     0.0000000
16     0.0000000
17     6.0970236
18     0.0000000
19     0.0000000
20     0.1768191


Table 5.5 The discounted call prices for the first 20 paths.


The risk-neutral valuation asserts that an option with payoff $F(S_T)$ can be
valued as $e^{-rT}E[F(S_T)\,|\,S_0 = S]$. Therefore, delta equals

$$\text{delta} = e^{-rT}\frac{\partial}{\partial S}E[F(S_T)\,|\,S_0 = S]. \qquad (5.11)$$

In order to compute delta under the Black-Scholes dynamics, the following
theorem is established.


Theorem 5.3 The delta of a European option with payoff $F(S_T)$ is given by

$$\text{delta} = e^{-rT}E\left[F(S_T)\frac{W_T}{\sigma T S}\right], \qquad (5.12)$$

where $W_T$ is the standard Brownian motion driving $S_T$.

Proof: Ignoring the discount factor, the definition of delta in (5.11) is

$$\frac{\partial}{\partial S}\int_{-\infty}^{\infty} F(e^x)\,\phi(x\,|\,\log S)\,dx = \frac{1}{S}\int_{-\infty}^{\infty} F(e^x)\,\frac{\partial \phi(x\,|\,\log S)}{\partial \log S}\,dx,$$

where

$$\phi(x\,|\,y) = \frac{1}{\sqrt{2\pi\sigma^2 T}}\exp\left[-\frac{(x - y - \nu T)^2}{2\sigma^2 T}\right].$$

Standard differentiation shows that

$$\frac{\partial \phi(x\,|\,y)}{\partial y} = \frac{x - y - \nu T}{\sigma^2 T}\,\phi(x\,|\,y).$$

Hence, we have

$$\text{delta} = \frac{1}{S}\int_{-\infty}^{\infty} F(e^x)\,\frac{x - \log S - \nu T}{\sigma^2 T}\,\phi(x\,|\,\log S)\,dx.$$

Recall that $x = \log S_T$, so that

$$x - \log S - \nu T = \log S_T - \log S - \nu T = \sigma W_T.$$

This completes the proof. $\Box$


[Figure: call price estimate plotted against the number of simulations.]

Fig. 5.6 Simulations of the call price against the size.


Theorem 5.3 enables us to simulate option delta (or even gamma) as follows.


STEP 1: Generate $Z_1, Z_2, \ldots, Z_n \sim N(0,1)$ i.i.d.
STEP 2: Set $Y_i = F\left(Se^{(r - \sigma^2/2)T + \sigma Z_i\sqrt{T}}\right)Z_i$, $i = 1, \ldots, n$.
STEP 3: Set $\text{delta} = \frac{e^{-rT}}{\sigma S\sqrt{T}}\cdot\frac{1}{n}\sum_{i=1}^{n} Y_i$.

The theorem can be extended to the case of path dependent options. However,
the derivation requires some knowledge of Malliavin calculus, which is beyond
the scope of the book. For details of this generalization, we refer to the
paper of Fournié et al. (1999).


Example 5.4 The current price is $10, interest rate 5%, volatility 40%. Simulate
the price and delta of a call option with strike price $12 and maturity 1
year by generating 10,000 terminal asset prices.


An algorithm can be constructed as follows:


STEP 1: Generate 10,000 terminal asset prices by the formula

$$S_T^j = S_0\exp\left[(r - \sigma^2/2)T + \sigma\sqrt{T}Z_j\right], \quad Z_j \sim N(0,1).$$

STEP 2: For j = 1 to 10,000, compute

$$C_j = \max(S_T^j - K, 0)\exp(-rT) \quad \text{and} \quad \mathrm{Del}_j = C_j Z_j/(\sigma\sqrt{T}S_0).$$

STEP 3: Compute call price = mean($C_j$) and delta = mean($\mathrm{Del}_j$).


The corresponding SPLUS code is given as follows. 


###Define parameters###
check1<-proc.time() #Set the initial time count
n <- 10000
S0 <- 10
maturity <- 1
K <- 12
r <- 0.05
sigma <- 0.4
nu <- r - sigma^2/2
S <- rep(0,n)
C <- rep(0,n)
C0 <- rep(0,n)
Del <- rep(0,n)

###Simulation###
z <- rnorm(n,0,1)
S <- S0*exp(nu*maturity+sigma*sqrt(maturity)*z)
C <- pmax(S-K,0)*exp(-r*maturity)
Del <- C*z/S0/sigma/sqrt(maturity)

###Compute Call and Delta values###
Call <- mean(C)
Delta <- mean(Del)
check2 <- proc.time() # checkpoint for method without Ito
Call
Delta

###The CPU time###
check2-check1


With a CPU time of 0.2 seconds, our simulation finds that the call price is 
1.075 and the delta is 0.441. The Black-Scholes call price and the delta are 
1.08 and 0.448, respectively. This demonstrates the efficiency and accuracy 
of the simulation algorithm. 
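
As a cross-check of Example 5.4, the following Python sketch applies the same likelihood-ratio estimator and compares it with the Black-Scholes values. It is not part of the original SPLUS listing: the function names `lr_call_price_and_delta` and `black_scholes_call`, the seed, and the sample size of 100,000 are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def lr_call_price_and_delta(s0, k, r, sigma, t, n, seed=0):
    """Monte Carlo call price and delta, with delta = mean(C_j * Z_j / (sigma * sqrt(T) * S0))."""
    rng = random.Random(seed)
    nu = r - 0.5 * sigma ** 2          # drift of the log price
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    price_sum = 0.0
    delta_sum = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp(nu * t + vol * z)    # terminal asset price
        c = disc * max(s_t - k, 0.0)             # discounted payoff C_j
        price_sum += c
        delta_sum += c * z / (s0 * vol)          # Del_j from STEP 2
    return price_sum / n, delta_sum / n


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes call price and delta, used only as a benchmark."""
    cdf = NormalDist().cdf
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * cdf(d1) - k * math.exp(-r * t) * cdf(d2), cdf(d1)


price, delta = lr_call_price_and_delta(10.0, 12.0, 0.05, 0.4, 1.0, 100_000, seed=1)
bs_price, bs_delta = black_scholes_call(10.0, 12.0, 0.05, 0.4, 1.0)
print(price, delta, bs_price, bs_delta)
```

The delta estimate reuses the draws that price the option, so no second simulation at a perturbed initial price is needed.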

One thing we have to stress is that Theorem 5.3 is very useful for simulating
deltas of single-asset European options with an arbitrary payoff $F(S_T)$. However,
it may not be applicable to path-dependent options and multi-asset options.
Therefore, we shall introduce alternative methods in later chapters.


5.4 EXERCISES 


1. Write the SPLUS code for the newsboy problem of Section 5.2. 


2. Suppose the asset return follows the t-distribution with 2 degrees of 
freedom. Write a SPLUS code to simulate the 95% confidence VaR with 
parameters given in Example 5.1. Compare your result with the one 
obtained by normal VaR. 


3. Implement the rejection method for generating GED when v = 1.4. 
4. Verify (5.5) and (5.6).

5. Verify (5.7) and (5.8).


6. Let $S_0 = 100$, $\mu = r = 0.05$, $\sigma = 0.3$. Use the geometric Brownian
motion method to simulate 20 daily prices of the stock $S_i$, $i = 1, \ldots, 20$.


(a) Suppose you want to determine the price of a European put option 
maturing in 20 days with a strike price K = 100. Use simulation 
techniques to estimate this price. 


(b) Compare your result with the one obtained from the Black-Scholes 
formula. Are they similar? 


7. The gamma of an option is defined as

$$\frac{\partial(\text{delta})}{\partial S}.$$

(a) What is the financial interpretation of the gamma?
(b) By modifying the proof of Theorem 5.3, show that

$$\text{gamma} = e^{-rT} E\left[F(S_T)\,\frac{1}{S^2\sigma T}\left(\frac{W_T^2}{\sigma T} - W_T - \frac{1}{\sigma}\right)\right].$$

(c) Construct and implement a simulation algorithm to compute the
call option gamma with a SPLUS code.

(d) Suppose $S = 10$, $K = 12$, $r = 0.1$, $\sigma = 0.3$, and $T = 0.8$. Compare
your simulation result with the closed-form solution:

$$\frac{1}{S\sigma\sqrt{2\pi T}}\exp\left\{-\frac{\left[\log(S/K) + (r + \sigma^2/2)T\right]^2}{2\sigma^2 T}\right\}.$$


5.5 APPENDIX 


The data comprise 2,264 daily rates of return. These data are transformed
into standardized returns by using the sample mean and standard deviation.
We assume that standardized returns follow a GED distribution with parameter
$\xi$. Our goal is to estimate $\xi$. The density function of the GED distribution
is given in (5.4). Hence, the likelihood function is

$$L(\xi) = \prod_{i=1}^{2,264} \frac{\xi \exp\left(-\frac{1}{2}\left|Z_i/\lambda\right|^{\xi}\right)}{\lambda\,2^{(1+1/\xi)}\,\Gamma(1/\xi)}, \qquad \lambda = \left[\frac{2^{-2/\xi}\,\Gamma(1/\xi)}{\Gamma(3/\xi)}\right]^{1/2},$$

where $Z_1, \ldots, Z_{2,264}$ are standardized returns. Instead of deriving the
maximum likelihood estimate (MLE) theoretically, we search for the maximum
point of the likelihood function with a numerical method. To confine the
target point to a small interval, we plot the log likelihood function against the
parameter $\xi$. In Fig. 5.7, we recognize that a unique maximum appears for
$\xi \in (1, 1.3)$. The SPLUS code of the likelihood function and the plot is given
after this paragraph. We then use the bisection method to search for the
solution. Specifically, we compare $L(1)$ and $L(1.3)$ and discard the smaller one.
The next step compares the remaining one with $L(1.15)$, the functional value
at the mid-point of 1 and 1.3. We discard the point with a smaller value of
$L$ and repeat the procedure until a sufficiently accurate solution is obtained.
Ultimately, $\hat{\xi} = 1.21$, which has been input to generate GED-VaR in Section
5.2.3.

loglikelihood <- function(Z,xi) {
  # GED log likelihood of the standardized returns Z at shape parameter xi
  L <- 0
  lambda <- sqrt( 2^(-2/xi)*gamma(1/xi)/gamma(3/xi) )
  for (i in 1:length(Z)){
    temp1 <- log(xi)-abs(Z[i]/lambda)^xi/2
    temp2 <- log(lambda)+log(2)*(1+1/xi)+lgamma(1/xi)
    L <- L + temp1 - temp2
  }
  L
}

### input returns from a text file ###
Z <- read.table("table.txt")
Z <- Z[,1]
Z <- (Z-mean(Z))/sqrt(var(Z))   # standardize the returns
print(Z)

### evaluate the log likelihood on a grid of xi values ###
a <- 0
XI <- (50:250)/100
for (i in 1:length(XI)){
  a[i] <- loglikelihood(Z,XI[i])
}

plot(XI,a,type='l',ylab='log likelihood')
print(XI[which(max(a)==a)])   # grid point with the largest log likelihood


[Plot: log likelihood (vertical axis) against $\xi$ (horizontal axis).]

Fig. 5.7 The log likelihood against $\xi$.
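
The grid search over the GED log likelihood can also be sketched in Python. The return series in `table.txt` is not reproduced here, so this sketch substitutes synthetic standard normal data, which is purely an assumption for illustration. Since the GED with $\xi = 2$ is the normal distribution, the maximizer should then land near 2 rather than near the 1.21 found for the actual returns. The names `ged_loglik` and `standardize` are hypothetical helpers.

```python
import math
import random


def ged_loglik(z, xi):
    """GED log likelihood of standardized data z at shape parameter xi."""
    lam = math.sqrt(2.0 ** (-2.0 / xi) * math.gamma(1.0 / xi) / math.gamma(3.0 / xi))
    # Terms that do not depend on the individual observation.
    const = (math.log(xi) - math.log(lam)
             - (1.0 + 1.0 / xi) * math.log(2.0) - math.lgamma(1.0 / xi))
    return sum(const - 0.5 * abs(v / lam) ** xi for v in z)


def standardize(x):
    """Subtract the sample mean and divide by the sample standard deviation."""
    m = sum(x) / len(x)
    s = math.sqrt(sum((v - m) ** 2 for v in x) / (len(x) - 1))
    return [(v - m) / s for v in x]


rng = random.Random(2)
z = standardize([rng.gauss(0.0, 1.0) for _ in range(2000)])  # synthetic stand-in data
grid = [k / 100 for k in range(50, 251)]                     # xi = 0.50, 0.51, ..., 2.50
xi_hat = max(grid, key=lambda xi: ged_loglik(z, xi))
print(xi_hat)
```

A grid search is used instead of the interval-halving scheme described above because it does not rely on the likelihood having a single peak over the search range.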

Variance Reduction Techniques


6.1 INTRODUCTION 


In standard Monte Carlo, we estimate the unknown quantity $\theta = EX$ by
generating random numbers $X_1, \ldots, X_n$ and using $\bar{X}_n$ to estimate $\theta$. Recall
that in the preceding chapter, the standard error for $\bar{X}_n$ is $\sigma/\sqrt{n}$, where $\sigma^2$ is
the variance of $X$. There are two sources of contributions to the standard error
of estimation. One is the factor $1/\sqrt{n}$, which is intrinsic to the Monte Carlo
method, and not much can be done about it. The other one is the standard
deviation $\sigma$ of the output $X$, which can be reduced by some techniques.
There are four standard methods to reduce $\sigma$:


1. Antithetic Variables 
2. Control Variates 

3. Stratification 

4. Importance Sampling 


We shall discuss each of these methods in the subsequent sections. 


6.2 ANTITHETIC VARIABLES 


The idea of antithetic variables can best be illustrated by considering a special
example. Suppose we want to estimate $\theta = EX$ by generating two outputs
$X_1$ and $X_2$ such that $EX_1 = EX_2 = \theta$ and $\mathrm{Var}X_1 = \mathrm{Var}X_2 = \sigma^2$. Then

$$\mathrm{Var}\left(\tfrac{1}{2}(X_1 + X_2)\right) = \tfrac{1}{4}\left(\mathrm{Var}X_1 + \mathrm{Var}X_2 + 2\mathrm{Cov}(X_1, X_2)\right) = \frac{\sigma^2}{2} + \frac{1}{2}\mathrm{Cov}(X_1, X_2) \le \frac{\sigma^2}{2} \quad \text{if } \mathrm{Cov}(X_1, X_2) \le 0.$$

Note that when $X_1$ and $X_2$ are independent, $\mathrm{Var}((X_1 + X_2)/2) = \sigma^2/2$.
Thus, the above inequality asserts that if $X_1$ and $X_2$ are negatively correlated,
then the variance of the mean of the two is less than in the case when
$X_1$ and $X_2$ are independent.

How do we generate negatively correlated random numbers? Suppose
we simulate $U_1, \ldots, U_m$, which are uniform random numbers. Then
$V_1 = 1 - U_1, \ldots, V_m = 1 - U_m$ would also be uniform random numbers with
the property that $U_i$ and $V_i$ are negatively correlated (exercise). If
$X_1 = h(U_1, \ldots, U_m)$, then $X_2 = h(V_1, \ldots, V_m)$ must have the same distribution as
$X_1$. It turns out that if $h$ is a monotone function (either increasing or decreasing)
in each of its arguments, then $X_1$ and $X_2$ are negatively correlated. This
result will be proved at the end of this section. Thus, after generating
$U_1, \ldots, U_m$ to compute $X_1$, instead of generating another new independent
set of $U$s to compute $X_2$, we compute $X_2$ by

$$X_2 = h(V_1, \ldots, V_m) = h(1 - U_1, \ldots, 1 - U_m).$$

Accordingly, $(X_1 + X_2)/2$ should have smaller variance.

In general, we may generate $X_i = F^{-1}(U_i)$ using the inverse transform
method. Let $Y_i = F^{-1}(V_i)$. Since $F$ is monotone, so is $F^{-1}$ and, hence, $X_i$ and
$Y_i$ will be negatively correlated. Both $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_n$ generated
in this way are i.i.d. sequences with c.d.f. $F$, but $X_i$ and $Y_i$ are negatively correlated.


Definition 6.1 The $Y_i$ sequence is called the sequence of antithetic variables.


For normal distributions, generating antithetic variables is straightforward.
Suppose that $X_i \sim N(\mu, \sigma^2)$. Then $Y_i = 2\mu - X_i$ also has a normal distribution
with mean $\mu$ and variance $\sigma^2$, and $X_i$ and $Y_i$ are negatively correlated.

More generally, if we want to compute $E(H(X))$ for some function $H$, standard
Monte Carlo suggests using $\frac{1}{n}\sum_{i=1}^{n} H(X_i)$. Then an antithetic estimator
of $E(H(X))$ is

$$\hat{H}_{AN} = \frac{1}{2n}\sum_{i=1}^{n}\left(H(X_i) + H(Y_i)\right),$$

where $Y_i$ is a sequence of antithetic variables. To see how variance reduction
is achieved by using this antithetic estimator, let $\mathrm{Var}(H(X)) = \sigma^2$ and
$\mathrm{Corr}(H(X), H(Y)) = \rho$. Consider

$$\mathrm{Var}(\hat{H}_{AN}) = \frac{1}{4n^2}\sum_{i=1}^{n}\left\{\mathrm{Var}H(X_i) + \mathrm{Var}H(Y_i) + 2\mathrm{Cov}(H(X_i), H(Y_i))\right\} = \frac{1}{4n^2}\left(2n\sigma^2 + 2n\rho\sigma^2\right) = \frac{\sigma^2}{2n}(1 + \rho).$$

Note that when $H(X)$ and $H(Y)$ are uncorrelated ($\rho = 0$), the variance
is reduced by a factor of 2, which is equivalent to doubling the
simulation size. On the other hand, if $\rho = -1$, then the variance would be
reduced to zero. As long as $\rho$ is negative, some form of variance reduction
can be achieved. An obvious question is that in view of this observation, why
not choose $Y$ so that $\rho = -1$? Such $Y$s may be difficult to construct since $\rho$
represents the correlation between $H(X)$ and $H(Y)$. In the case $H(X) = X$ with
$Y = 2\mu - X$, $\hat{H}_{AN}$ reduces to the constant $\mu$, which is the perfect scenario. In view of
these caveats, we usually choose the antithetic variables $Y$ so that $\rho$ is negative,
not necessarily $-1$. When $H$ is linear, such as the case $H(X) = X$, the
antithetic variable works best. In general, the more linear $H$ is, the more
effective the antithetic variable is.


Example 6.1 Let $\theta = E(e^U) = \int_0^1 e^x\,dx$.


We know that $\theta = e - 1$. Consider the antithetic variable $V = 1 - U$. Recall
that the moment generating function of $U$ equals $E(e^{tU}) = (e^t - 1)/t$. Now

$$\mathrm{Cov}(e^U, e^V) = E(e^U e^V) - E(e^U)E(e^V) = E(e^U e^{1-U}) - E(e^U)E(e^{1-U}) = e - (e - 1)^2 = -0.2342.$$

Furthermore,

$$\mathrm{Var}(e^U) = E(e^{2U}) - (E(e^U))^2 = (e^2 - 1)/2 - (e - 1)^2 = 0.242.$$

Thus, for independent uniform (0,1) random variables $U_1$ and $U_2$,

$$\mathrm{Var}[(e^{U_1} + e^{U_2})/2] = \mathrm{Var}(e^U)/2 = 0.121.$$

But

$$\mathrm{Var}[(e^U + e^V)/2] = \mathrm{Var}(e^U)/2 + \mathrm{Cov}(e^U, e^V)/2 = 0.121 - 0.2342/2 = 0.0039,$$

achieving a substantial variance reduction of 96.7%. □
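
The figures in Example 6.1 can be checked numerically. The Python sketch below evaluates the closed-form expressions and then compares them with sample variances from simulated independent and antithetic pairs. The seed and the number of pairs are arbitrary choices made for this illustration.

```python
import math
import random
from statistics import mean, variance

# Closed-form quantities from Example 6.1.
var_exp_u = (math.e ** 2 - 1) / 2 - (math.e - 1) ** 2   # Var(e^U)
cov_exp_uv = math.e - (math.e - 1) ** 2                 # Cov(e^U, e^(1-U))
var_indep = var_exp_u / 2                               # two independent uniforms
var_anti = var_exp_u / 2 + cov_exp_uv / 2               # one antithetic pair
reduction = 1 - var_anti / var_indep

# Simulation check: each entry is the average over one pair.
rng = random.Random(3)
n_pairs = 50_000
indep = [(math.exp(rng.random()) + math.exp(rng.random())) / 2 for _ in range(n_pairs)]
anti = []
for _ in range(n_pairs):
    u = rng.random()
    anti.append((math.exp(u) + math.exp(1.0 - u)) / 2)   # V = 1 - U

print(var_indep, var_anti, reduction)
print(variance(indep), variance(anti), mean(anti))
```

Both estimators have mean $e - 1$; only the spread of the pair averages differs.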
We are now ready to justify the argument used in advocating antithetic 
variables. 

Theorem 6.1 Let $X_1, \ldots, X_n$ be independent. Then for any increasing functions
$f$ and $g$ of $n$ variables,

$$E(f(X)\,g(X)) \ge E f(X)\,E g(X),$$

where $X = (X_1, \ldots, X_n)$.


Proof. By mathematical induction. Consider $n = 1$. Then

$$(f(x) - f(y))(g(x) - g(y)) \ge 0 \quad \text{for all } x \text{ and } y,$$

as both factors are either nonnegative ($x \ge y$) or nonpositive ($x \le y$). Thus,
for any random variables $X$ and $Y$,

$$(f(X) - f(Y))(g(X) - g(Y)) \ge 0, \quad \text{implying} \quad E\left((f(X) - f(Y))(g(X) - g(Y))\right) \ge 0.$$

In other words,

$$E(f(X)\,g(X)) + E(f(Y)\,g(Y)) \ge E(f(X)\,g(Y)) + E(f(Y)\,g(X)).$$

If $X$ and $Y$ are independent and identically distributed, then

$$E(f(X)\,g(X)) = E(f(Y)\,g(Y)),$$

$$E(f(X)\,g(Y)) = E(f(Y)\,g(X)) = E(f(Y))\,E(g(X)) = E(f(X))\,E(g(X)),$$

so that

$$E(f(X)\,g(X)) \ge E(f(X))\,E(g(X)),$$

proving the result for the case $n = 1$. Assume the result for $n - 1$. Suppose
$X_1, \ldots, X_n$ are independent and let $f$ and $g$ be increasing functions. Then

$$E(f(X)\,g(X) \mid X_n = x_n) = E(f(X_1, \ldots, X_{n-1}, x_n)\,g(X_1, \ldots, X_{n-1}, x_n) \mid X_n = x_n)$$

$$= E(f(X_1, \ldots, X_{n-1}, x_n)\,g(X_1, \ldots, X_{n-1}, x_n)) \quad \text{(because of independence)}$$

$$\ge E(f(X_1, \ldots, X_{n-1}, x_n))\,E(g(X_1, \ldots, X_{n-1}, x_n)) \quad \text{(by induction hypothesis)}$$

$$= E(f(X) \mid X_n = x_n)\,E(g(X) \mid X_n = x_n).$$

Hence,

$$E(f(X)\,g(X) \mid X_n) \ge E(f(X) \mid X_n)\,E(g(X) \mid X_n).$$

Upon taking expectation on both sides of this inequality, we have

$$E(f(X)\,g(X)) \ge E[E(f(X) \mid X_n)\,E(g(X) \mid X_n)].$$

Observe that $E(f(X) \mid X_n)$ and $E(g(X) \mid X_n)$ are increasing functions of $X_n$, so
that by the result for $n = 1$, we have

$$E[E(f(X) \mid X_n)\,E(g(X) \mid X_n)] \ge E[E(f(X) \mid X_n)]\,E[E(g(X) \mid X_n)] = E(f(X))\,E(g(X)).$$

This completes the proof for the case of $n$. □


Corollary 6.1 If $h(x_1, \ldots, x_n)$ is a monotone function of each of its arguments,
then for a set $U_1, \ldots, U_n$ of independent random numbers,

$$\mathrm{Cov}[h(U_1, \ldots, U_n),\, h(1 - U_1, \ldots, 1 - U_n)] \le 0.$$


Proof. Without loss of generality, by redefining $h$, we may assume that $h$
is increasing in its first $r$ arguments and decreasing in its remaining $n - r$
arguments. Let

$$f(x_1, \ldots, x_n) = h(x_1, \ldots, x_r, 1 - x_{r+1}, \ldots, 1 - x_n),$$

$$g(x_1, \ldots, x_n) = -h(1 - x_1, \ldots, 1 - x_r, x_{r+1}, \ldots, x_n).$$

It follows that both $f$ and $g$ are increasing functions. By the preceding theorem,

$$\mathrm{Cov}[f(U_1, \ldots, U_n),\, g(U_1, \ldots, U_n)] \ge 0.$$

That is,

$$\mathrm{Cov}[h(U_1, \ldots, U_r, V_{r+1}, \ldots, V_n),\, h(V_1, \ldots, V_r, U_{r+1}, \ldots, U_n)] \le 0, \qquad (6.1)$$

where $V_i = 1 - U_i$. Observe that since $(h(U_1, \ldots, U_n), h(V_1, \ldots, V_n))$ has the
same joint distribution as $(h(U_1, \ldots, U_r, V_{r+1}, \ldots, V_n), h(V_1, \ldots, V_r, U_{r+1}, \ldots, U_n))$,
it follows from (6.1) that

$$\mathrm{Cov}[h(U_1, \ldots, U_n),\, h(V_1, \ldots, V_n)] \le 0,$$

proving the corollary. □


When are antithetic variables effective? Here are some guidelines:


• Antithetic variables will result in a lower variance estimate than independent
simulations only if the values computed from a path and its
antithetic counterpart are negatively correlated.


• If $H$ is monotone in each of its arguments, then antithetic variables
reduce variance in estimating $E(H(Z_1, \ldots, Z_n))$.


• If $H$ is linear, then an antithetic estimate of $E(H(Z_1, \ldots, Z_n))$ has zero
variance.


• If $H$ is symmetric, that is, $H(-Z) = H(Z)$, then an antithetic estimate
of sample size $2n$ has the same variance as an independent sample of
size $n$.


Example 6.2 To illustrate some of these points, consider the simulation of the
payoff of options using antithetic variables. The function $H$ here maps

$$z \mapsto \max\{0,\; S_0 \exp([r - \sigma^2/2]T + \sigma\sqrt{T}\,z) - K\}.$$

[Six panels, each plotting the payoff $H(Z)$ against $z$.]

Fig. 6.1 Illustration of payoffs for antithetic comparisons.


In Fig. 6.1, the vertical axis is the payoff and the horizontal axis is the
value of $z$, the input standard normal. All cases have $r = 0.05$, $K = 50$,
and $T = 0.5$. The top three cases have $\sigma = 0.3$ and $S_0 = 40$, 50, and 60; the
bottom three cases have $S_0 = 50$ and $\sigma = 0.10, 0.20, 0.30$. The top three graphs
correspond to the function $H$ for options that are out-of-the-money ($S_0 = 40$),
at-the-money ($S_0 = 50$), and in-the-money ($S_0 = 60$), respectively; the bottom
three graphs correspond to low, intermediate, and high volatility for an at-the-money
option. (The precise parameter values are given in the caption of the
figure.) As one would expect, increasing moneyness and decreasing volatility
both increase the degree of linearity. For the values indicated in the figure,
we find numerically that antithetics reduce variance by 14%, 42%, and 80% in
the top three cases and by 65%, 49%, and 42% in the bottom three. Clearly,
the more linear the function $H$ is, the more effective the antithetic variable
technique is. □


Example 6.3 Fig. 6.2 plots the payoff $|S_T - K|$ on a straddle as a function
of $z$. The parameter values are given in the caption. The graph shows a high
degree of symmetry around zero, suggesting that antithetic variables may not
be as effective as in the other cases. Numerical results here indicate that an
antithetic estimate based on $m$ pairs of antithetic variables has higher variance
than an estimate based on $2m$ independent samples.

[Plot: straddle payoff (vertical axis) against $z$ (horizontal axis, from -4 to 4).]

Fig. 6.2 Payoff on a straddle as a function of input normal $Z$ based on the parameters
$S_0 = K = 50$, $\sigma = 0.30$, $T = 1$, and $r = 0.05$.


######## Part (1): Standard method ########
p <- 10000
S0 <- 50
K <- 50
t <- 0.5
mu <- 0.05
sigma <- 0.3
nu <- mu - sigma^2/2
ST <- rep(0,2*p)
C <- rep(0,2*p)
P <- rep(0,2*p)
for (i in 1:(2*p))
{
  z <- rnorm(1)
  ST[i] <- S0*exp(nu*t+sigma*sqrt(t)*z)
  C[i] <- max(ST[i]-K,0)     # call payoff
  P[i] <- max(K-ST[i],0)     # put payoff
}
C.bar <- mean(C)
a1 <- var(C)/(2*p)            # variance of the call estimate
straddle <- C+P
b1 <- var(straddle)/(2*p)     # variance of the straddle estimate

######## Part (2): Antithetics ########
ST1 <- rep(0,p)
ST2 <- rep(0,p)
C1 <- rep(0,p)
C2 <- rep(0,p)
P1 <- rep(0,p)
P2 <- rep(0,p)
for (i in 1:p)
{
  z1 <- rnorm(1)
  z2 <- -z1                   # antithetic normal
  ST1[i] <- S0*exp(nu*t+sigma*sqrt(t)*z1)
  ST2[i] <- S0*exp(nu*t+sigma*sqrt(t)*z2)
  C1[i] <- max(ST1[i]-K,0)
  C2[i] <- max(ST2[i]-K,0)
  P1[i] <- max(K-ST1[i],0)    # put payoffs use K-ST, as in Part (1)
  P2[i] <- max(K-ST2[i],0)
}
C <- (C1+C2)/2
P <- (P1+P2)/2
C.bar <- mean(C)
a2 <- var(C)/p
straddle <- C+P
b2 <- var(straddle)/p
a2/a1    # variance ratio for the call
b2/b1    # variance ratio for the straddle
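
The comparison in Examples 6.2 and 6.3 can be sketched in Python as follows. The function name `variance_ratios`, the seed, and the number of pairs are assumptions made for illustration. The parameters follow the SPLUS listing ($S_0 = K = 50$, $\sigma = 0.3$, $t = 0.5$, drift 0.05), and payoffs are left undiscounted as in that listing.

```python
import math
import random
from statistics import variance


def variance_ratios(s0, k, r, sigma, t, p, seed=0):
    """Ratio of antithetic to independent estimator variance (both use 2p draws)."""
    rng = random.Random(seed)
    nu = r - 0.5 * sigma ** 2
    vol = sigma * math.sqrt(t)

    def terminal(z):
        return s0 * math.exp(nu * t + vol * z)

    payoffs = {
        "call": lambda s: max(s - k, 0.0),
        "straddle": lambda s: abs(s - k),       # call payoff plus put payoff
    }
    # 2p independent terminal prices.
    independent = [terminal(rng.gauss(0.0, 1.0)) for _ in range(2 * p)]
    # p antithetic pairs built from z and -z.
    pairs = []
    for _ in range(p):
        z = rng.gauss(0.0, 1.0)
        pairs.append((terminal(z), terminal(-z)))

    ratios = {}
    for name, payoff in payoffs.items():
        var_ind = variance([payoff(s) for s in independent]) / (2 * p)
        var_anti = variance([(payoff(a) + payoff(b)) / 2 for a, b in pairs]) / p
        ratios[name] = var_anti / var_ind
    return ratios


ratios = variance_ratios(50.0, 50.0, 0.05, 0.3, 0.5, 20_000, seed=8)
print(ratios)
```

A ratio below 1 means the antithetic estimate is more precise for the same number of normal draws. The nearly symmetric straddle payoff is expected to give a ratio above 1.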

6.3 STRATIFIED SAMPLING


The idea of stratification is often used in sample surveys (Barnett, 1991).
It rests on the observation that the population may be heterogeneous
and consist of various homogeneous subgroups (defined, for example, by gender,
race, or socio-economic status). If we wish to learn about the whole population (such as
whether people in Hong Kong would like to have universal suffrage in 2007),
we can take a random sample from the whole population to estimate that
quantity. On the other hand, it would be more efficient to take small samples
from each subgroup and combine the estimates in each subgroup according to
the fraction of the population that subgroup represents. Since we can learn
about the opinion of a homogeneous subgroup with a relatively small sample
size, this stratified sampling procedure would be more efficient.

In general, suppose we want to estimate $EX$, where $X$ depends on a random
variable $S$ that takes on one of the values in $\{1, \ldots, k\}$ with known probabilities.
The technique of stratification performs the simulation runs in $k$ groups, with the $i$th group
having $S = i$, lets $\bar{X}_i$ be the average value of $X$ in those runs with $S = i$,
and then estimates $EX = \sum_{i=1}^{k} E(X \mid S = i)P(S = i)$ by

$$\sum_{i=1}^{k} \bar{X}_i\,P(S = i).$$

This is known as stratified sampling.

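To illustrate this idea, suppose we want to estimate $E(g) = \int_0^1 g(x)\,dx$.
Consider two estimators based on a sample of $2n$ runs. The first one is the
standard method,

$$\hat{g} = \frac{1}{2n}\sum_{i=1}^{2n} g(U_i).$$

Note that $E(\hat{g}) = E(g(U))$ and

$$\mathrm{Var}(\hat{g}) = \frac{1}{2n}\mathrm{Var}(g(U_i)) = \frac{1}{2n}\left\{\int_0^1 g^2(x)\,dx - \left(\int_0^1 g(x)\,dx\right)^2\right\}.$$

On the other hand, we can write

$$E(g(U)) = \int_0^{1/2} g(x)\,dx + \int_{1/2}^{1} g(x)\,dx.$$

Instead of selecting $U$s from $[0,1]$, we can select the first $n$ $U$s from $[0,1/2]$
and the remaining $n$ $U$s from $[1/2,1]$ to construct a new estimator

$$\hat{g}_s = \frac{1}{2n}\left\{\sum_{i=1}^{n} g(U_i/2) + \sum_{i=n+1}^{2n} g((U_i + 1)/2)\right\}.$$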

It can be easily seen that if $U \sim U(0,1)$, then $V = a + (b - a)U$ is distributed
as uniform $(a,b)$. In particular, $U/2 \sim U(0,1/2)$ and $(U + 1)/2 \sim U(1/2,1)$.
To compute the variance of the new estimator, note that

$$\mathrm{Var}(\hat{g}_s) = \frac{1}{4n^2}\left[\sum_{i=1}^{n}\mathrm{Var}(g(U_i/2)) + \sum_{i=n+1}^{2n}\mathrm{Var}(g((U_i + 1)/2))\right].$$

Direct computations show that if $U_i \sim U(0,1)$, then

$$\mathrm{Var}\left(g\left(\frac{U_i}{2}\right)\right) = 2\int_0^{1/2} g^2(x)\,dx - 4m_1^2,$$

$$\mathrm{Var}\left(g\left(\frac{U_i + 1}{2}\right)\right) = 2\int_{1/2}^{1} g^2(x)\,dx - 4m_2^2,$$

where $m_1 = \int_0^{1/2} g(x)\,dx$ and $m_2 = \int_{1/2}^{1} g(x)\,dx$. Now

$$\mathrm{Var}\left(g\left(\frac{U_i}{2}\right)\right) + \mathrm{Var}\left(g\left(\frac{U_i + 1}{2}\right)\right) = 2\int_0^1 g^2(x)\,dx - 4(m_1^2 + m_2^2).$$

Consequently,

$$\mathrm{Var}(\hat{g}_s) = \frac{1}{2n}\left\{\int_0^1 g^2(x)\,dx - 2(m_1^2 + m_2^2)\right\}.$$

Note that

$$(m_1 + m_2)^2 + (m_1 - m_2)^2 = 2(m_1^2 + m_2^2).$$

Therefore,

$$\mathrm{Var}(\hat{g}_s) = \frac{1}{2n}\left\{\int_0^1 g^2(x)\,dx - (m_1 + m_2)^2 - (m_1 - m_2)^2\right\} = \mathrm{Var}(\hat{g}) - \frac{1}{2n}(m_1 - m_2)^2.$$

Since this second term is always non-negative, stratification reduces the variance
by the amount of this second term. The bigger the difference between $m_1$ and
$m_2$, the greater the reduction in variance. In general, if more strata are introduced,
more reduction will be achieved. One can generalize this result to
the multi-strata case, but we will omit the mathematical details here.


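Example 6.4 Consider again $\theta = E(e^U) = \int_0^1 e^x\,dx$.


Recall that by standard Monte Carlo with two uniform draws ($n = 1$ in the notation above),

$$\hat{g} = \frac{1}{2}\left(e^{U_1} + e^{U_2}\right),$$

and $\mathrm{Var}(\hat{g}) = 0.121$. On the other hand, using stratification, we have

$$\hat{g}_s = \frac{1}{2}\left(e^{U_1/2} + e^{(U_2 + 1)/2}\right),$$

and $\mathrm{Var}(\hat{g}_s) = \mathrm{Var}(\hat{g}) - (m_1 - m_2)^2/2$, where $m_1 = \int_0^{1/2} e^x\,dx = e^{1/2} - 1$ and
$m_2 = \int_{1/2}^{1} e^x\,dx = e - e^{1/2}$. Thus,

$$\mathrm{Var}(\hat{g}_s) = 0.121 - (2e^{1/2} - e - 1)^2/2 = 0.0325,$$

resulting in a variance reduction of 73.13%. □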
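
A short numerical check of Example 6.4 is sketched below in Python. It evaluates the closed-form variances and then compares them with sample variances of the two estimators, each built from two uniform draws. The seed and the number of replications are arbitrary assumptions.

```python
import math
import random
from statistics import variance

# Closed-form quantities from Example 6.4.
m1 = math.exp(0.5) - 1.0                  # integral of e^x over [0, 1/2]
m2 = math.e - math.exp(0.5)               # integral of e^x over [1/2, 1]
var_exp_u = (math.e ** 2 - 1) / 2 - (math.e - 1) ** 2
var_standard = var_exp_u / 2              # two draws from U(0, 1)
var_stratified = var_standard - (m1 - m2) ** 2 / 2

# Simulation check: one estimate per replication.
rng = random.Random(4)
reps = 50_000
standard = [(math.exp(rng.random()) + math.exp(rng.random())) / 2
            for _ in range(reps)]
stratified = [(math.exp(rng.random() / 2) + math.exp((rng.random() + 1) / 2)) / 2
              for _ in range(reps)]

print(var_standard, var_stratified)
print(variance(standard), variance(stratified))
```

The stratified estimator places exactly one draw in each half of the unit interval, which removes the variability due to both draws landing in the same half.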
Stratified sampling is also very useful for drawing random samples from designated
ranges. For example, if we want to sample $Z_1, \ldots, Z_{100}$ from a standard
normal distribution, the standard technique would partition the whole real line
$(-\infty, \infty)$ into a number of bins and sample $Z$s from these bins randomly. In
such a case, it is inevitable that some bins may have more samples while other
bins, particularly those near the tails, may have no sample at all. Therefore,
a random sample drawn this way would underrepresent the tails. While this
may not be a serious issue in general, it may have a severe effect when the tail is
the quantity of interest, as in the simulation of VaR. To ensure
that the bins are regularly represented, we may generate the $Z$s as follows.
Let

$$V_i = \frac{1}{100}\left(U_i + (i - 1)\right), \quad i = 1, \ldots, 100,$$

where $U_i \sim U(0,1)$ i.i.d. By the property of the uniform distribution,
$V_i \sim U\left(\frac{i-1}{100}, \frac{i}{100}\right)$. Now let $Z_i = \Phi^{-1}(V_i)$. Then $Z_i$ falls between the $(i-1)$th and
$i$th percentiles of the standard normal distribution. For example, if $i = 1$,
then $V = U/100 \sim U(0,1/100)$ so that $Z = \Phi^{-1}(V)$ falls between $\Phi^{-1}(0) = -\infty$
and $\Phi^{-1}(0.01)$, i.e., the 0th and the 1st percentile of a standard normal
distribution.

This method gives equal weight to each of the 100 equiprobable strata.
Of course, the number 100 can be replaced by any number that is desirable.
The price we pay in stratification is the loss of independence of the $Z$s. This
complicates statistical inference for simulation results.
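
The construction $Z_i = \Phi^{-1}(V_i)$ can be sketched in Python with the standard library's `NormalDist`. The function name `stratified_normals`, the seed, and the small clamp that keeps $V_i$ strictly inside (0, 1) are assumptions of this sketch and not part of the method described above.

```python
import random
from statistics import NormalDist


def stratified_normals(n, seed=0):
    """Draw one standard normal from each of n equiprobable strata."""
    rng = random.Random(seed)
    inv_cdf = NormalDist().inv_cdf
    eps = 1e-12                      # guard: inv_cdf is undefined at 0 and 1
    sample = []
    for i in range(1, n + 1):
        v = (rng.random() + (i - 1)) / n     # V_i ~ U((i-1)/n, i/n)
        v = min(max(v, eps), 1.0 - eps)
        sample.append(inv_cdf(v))            # Z_i = Phi^{-1}(V_i)
    return sample


z = stratified_normals(100, seed=5)
print(min(z), max(z))
```

Because each $Z_i$ is confined to its own percentile band, the sample comes out already sorted, which is one visible sign that the draws are no longer independent.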


Example 6.5 As an illustration of stratification, consider simulating standard
normal random numbers via the standard method and the stratification method,
respectively. As can be clearly seen from Fig. 6.3 and Fig. 6.4, stratified sampling
generates samples much more uniformly over the range than standard
Monte Carlo. The SPLUS codes for these simulations are given as follows. □


######## Standard method ########
N <- 500
U1 <- runif(N)
X1 <- qnorm(U1)
hist(X1,freq=F,xlim=range(-3.5,3.5),nclass=50)

######## Stratified Sampling ########
U2 <- runif(N)
i <- 0:(N-1)
V <- (U2+i)/N      # one uniform draw in each of the N equal strata
X2 <- qnorm(V)
hist(X2,freq=F,xlim=range(-3.5,3.5),nclass=50)


Fig. 6.3 Simulations of 500 standard normal random numbers by standard Monte
Carlo.

Fig. 6.4 Simulations of 500 standard normal random numbers by stratified sampling.


Example 6.6 As a second illustration of stratification, consider the simulation
of a European call option of Example 5.2 again.

In Example 5.2, we simulate the terminal prices $S_1(T), \ldots, S_n(T)$ according
to (5.9) and then compute the estimate as

$$\hat{C} = \frac{e^{-rT}}{n}\sum_{i=1}^{n}\max\{S_i(T) - K, 0\}.$$

In this standard simulation, the random normals are sampled arbitrarily over
the whole real line. We can improve the efficiency by introducing stratification.


1. Partition $(-\infty, \infty)$ into $B$ strata or bins.


2. Set $V_i = \frac{1}{B}(U_i + (i - 1))$, $i = 1, \ldots, B$, and generate the desired number
of random samples ($N_B$, say) of $V$s in the $i$th bin.


3. Apply $\Phi^{-1}(V_i)$ to get the desired normal random numbers from each
bin and calculate $\hat{C}_i$ from each bin.


4. Average the $\hat{C}_i$ over the total number of bins to get an overall estimate
$\hat{C}$.


5. Calculate the standard error as in the previous cases.


This numerical example uses $S_0 = 10$, $K = 12$, $r = 0.03$, $\sigma = 0.40$,
and $T = 1$. The theoretical Black-Scholes price is 1.0139. We simulate the
European option price for different bin sizes with $N_B \times B = 1{,}000$ in all cases.
The effect of stratification increases as we increase the number of bins. The
SPLUS code and the results (Table 6.1) are as follows. □


n <- 1000    # total sample size
B <- 100     # no. of bins
NB <- n/B    # sample size per bin
S0 <- 10
K <- 12
mu <- 0.03
sigma <- 0.4
nu <- mu-sigma^2/2
t <- 1
u <- 0
z <- 0
ST <- rep(0,NB)
Ci <- rep(0,NB)
Ci.bar <- 0
varr <- 0
for (i in 0:(B-1))
{
  u <- runif(NB)
  z <- qnorm((u+i)/B)      # normals restricted to bin i+1
  for (j in 1:NB)
  {
    ST[j] <- S0*exp(nu*t+sigma*sqrt(t)*z[j])
    Ci[j] <- exp(-mu*t)*max(ST[j]-K,0)
  }
  Ci.bar <- Ci.bar + mean(Ci)   # accumulate the bin means
  varr <- varr + var(Ci)        # accumulate the within-bin variances
}
C <- Ci.bar/B
SE <- sqrt(varr/NB)/B
C
SE


Bins (B)    N_B     Mean (C)    Std. Err.

    1      1000     0.9744      0.0758
    2       500     1.0503      0.0736
    5       200     1.0375      0.0505
   10       100     0.9960      0.0389
   20        50     0.9874      0.0229
   50        20     1.0168      0.0146
  100        10     0.9957      0.0092
  200         5     1.0208      0.0094
  500         2     1.0151      0.0062
 1000         1     1.0091      NA


Table 6.1 Effects of stratification for simulated option prices with different bin sizes.


Regular stratification puts equal weight on each of the $B$ bins. Such an
allocation may not be ideal, as one would like to have sample sizes directly
related to the variability of the target function over that bin. To illustrate
this point, consider the payoff of a European call option again.


Example 6.7 Stratified sampling for a European call with the same parameter
values as in Example 6.6.


We know that if $S_T < K$, then the payoff of the call is zero. Recall

$$S_T = S_0 e^{(r - \sigma^2/2)T + \sigma\sqrt{T}Z}.$$

Therefore, $S_T < K$ iff $S_0 e^{(r - \sigma^2/2)T + \sigma\sqrt{T}Z} < K$. That is,

$$Z < \left[\log(K/S_0) - (r - \sigma^2/2)T\right]/(\sigma\sqrt{T}) := L.$$

Every simulated $Z < L$ is wasted as it just returns the value 0. We
should concentrate only on the interval $[L, \infty)$. How can we achieve this
goal?


1. Find the c.d.f. of a standard normal random variable $Y$ restricted to $[L, \infty)$. It
can be shown that $Y$ has c.d.f.

$$F(y) = \frac{\Phi(y) - \Phi(L)}{1 - \Phi(L)}, \quad y \ge L.$$


2. Use the inverse transform method to generate $Y$. Consider the inverse
transformation of $F$, i.e., solve for $y$ such that $y = F^{-1}(x)$. Writing it
out, we have $x = F(y) = \frac{\Phi(y) - \Phi(L)}{1 - \Phi(L)}$ so that

$$y = \Phi^{-1}\left(x(1 - \Phi(L)) + \Phi(L)\right).$$

Now generate $U$ from uniform (0,1) and evaluate

$$Y = \Phi^{-1}\left(U(1 - \Phi(L)) + \Phi(L)\right).$$

3. Plug the generated $Y$ into the simulation step of the payoff of the call
and complete the analysis. Note that when evaluating the new estimator
for the payoff, we need to multiply by the factor $1 - \Phi(L)$. That is,

$$C^* = (1 - \Phi(L))\,\hat{C},$$

where $\hat{C}$ is the average of the simulated payoffs using the truncated
normal random variables.


In general, we would like to apply the stratification technique to bins in
which the variability of the integrand is largest. Here, we just focus the entire
sample on the case $S_T > K$. □

The SPLUS code and the output (Table 6.2) are given as follows: 


n <- 1000    # total sample size
B <- 100     # no. of bins
NB <- n/B    # sample size per bin
S0 <- 10
K <- 12
mu <- 0.03
sigma <- 0.4
nu <- mu-sigma^2/2
t <- 1
u <- 0
z <- 0
ST <- rep(0,NB)
Ci <- rep(0,NB)
Ci.bar <- 0
varr <- 0
# truncation point L; the divisor is sigma*sqrt(t), which equals sigma*t only when t = 1
L <- (log(K/S0)-(mu-sigma^2/2)*t)/(sigma*sqrt(t))
for (i in 0:(B-1))
{
  u <- runif(NB)
  v <- (u+i)/B
  z <- qnorm(v*(1-pnorm(L))+pnorm(L))   # normals restricted to [L, Inf)
  for (j in 1:NB)
  {
    ST[j] <- S0*exp(nu*t+sigma*sqrt(t)*z[j])
    Ci[j] <- exp(-mu*t)*max(ST[j]-K,0)
  }
  Ci.bar <- Ci.bar + mean(Ci)
  varr <- varr + var(Ci)
}
C <- (1-pnorm(L))*Ci.bar/B
# Note: SE below refers to the truncated mean Ci.bar/B;
# multiply by (1-pnorm(L)) to obtain the standard error of C itself.
SE <- sqrt(varr/NB)/B
C
SE
## The theoretical Black-Scholes value is 1.0139.
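
The three steps of Example 6.7 can be sketched in Python without the additional binning. The function name `truncated_call_estimate`, the seed, the sample size, and the clamp just below 1 are assumptions of this sketch. Unlike the SPLUS listing, the standard error returned here is scaled by $1 - \Phi(L)$ so that it refers to $C^*$ itself.

```python
import math
import random
from statistics import NormalDist


def truncated_call_estimate(s0, k, r, sigma, t, n, seed=0):
    """Call price from normals restricted to [L, inf), rescaled by 1 - Phi(L)."""
    nd = NormalDist()
    rng = random.Random(seed)
    nu = r - 0.5 * sigma ** 2
    vol = sigma * math.sqrt(t)
    lower = (math.log(k / s0) - nu * t) / vol     # L in the text
    p_low = nd.cdf(lower)                         # Phi(L)
    p_tail = 1.0 - p_low                          # 1 - Phi(L)
    disc = math.exp(-r * t)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        u = rng.random()
        v = min(p_low + u * p_tail, 1.0 - 1e-12)  # guard against inv_cdf(1)
        y = nd.inv_cdf(v)                         # Y = Phi^{-1}(U(1 - Phi(L)) + Phi(L))
        payoff = disc * max(s0 * math.exp(nu * t + vol * y) - k, 0.0)
        total += payoff
        total_sq += payoff * payoff
    cond_mean = total / n                          # average truncated payoff
    cond_var = (total_sq - n * cond_mean ** 2) / (n - 1)
    return p_tail * cond_mean, p_tail * math.sqrt(cond_var / n)


estimate, std_err = truncated_call_estimate(10.0, 12.0, 0.03, 0.4, 1.0, 20_000, seed=7)
print(estimate, std_err)
```

Every draw now lands in the region where the payoff is positive, so no simulation effort is spent on paths that return zero.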


6.4 CONTROL VARIATES 


The idea of control variates is very simple. Suppose we want to estimate
$\theta = EX$ from the simulated data. Suppose that for some other variable $Y$,
the mean $\mu_Y = EY$ is known. Then for any given constant $c$, the quantity

$$X_{CV} = X + c(Y - \mu_Y)$$

is also an unbiased estimate of $\theta$ since $E(X_{CV}) = \theta$. Presumably, if we choose
the constant $c$ cleverly, some form of variance reduction can be achieved. How
can we do this? In other words, what would be a good choice of $c$? To answer
this question, first consider the variance of the new estimator $X_{CV}$, call it
$\sigma_{CV}^2$:

$$\sigma_{CV}^2 = \mathrm{Var}(X + c(Y - \mu_Y)) = \mathrm{Var}X + c^2\mathrm{Var}Y + 2c\,\mathrm{Cov}(X,Y).$$

Bins (B)    N_B     Mean (C)    Std. Err.    Adj. Mean    SE

    1      1000     0.9744      0.0758       0.9842       0.1102
    2       500     1.0503      0.0736       1.0303       0.0823
    5       200     1.0375      0.0505       1.0235       0.0524
   10       100     0.9960      0.0389       1.0101       0.0404
   20        50     0.9874      0.0229       1.0058       0.0238
   50        20     1.0168      0.0146       1.0147       0.0153
  100        10     0.9957      0.0092       1.0089       0.0095
  200         5     1.0208      0.0094       1.0160       0.0099
  500         2     1.0151      0.0062       1.0143       0.0066
 1000         1     1.0091      NA           1.0125       NA


Table 6.2 Effects of stratification for simulated option prices with restricted normal.


We would like to find $c$ such that $\sigma_{CV}^2$ is minimized. Differentiating the preceding
expression with respect to $c$ and setting it equal to zero, we have

$$2c\,\mathrm{Var}Y + 2\mathrm{Cov}(X,Y) = 0.$$

Solving for $c$, we get $c^* = -\mathrm{Cov}(X,Y)/\mathrm{Var}Y$ as the value of $c$ that
minimizes $\sigma_{CV}^2$. For such a $c^*$,

$$\sigma_{CV}^2 = \mathrm{Var}X - \frac{\mathrm{Cov}^2(X,Y)}{\mathrm{Var}Y}.$$

The variable $Y$ used in this way is known as a control variate for the simulation
estimator $X$. Recall that $\mathrm{Corr}(X,Y) = \mathrm{Cov}(X,Y)/(\mathrm{Var}X\,\mathrm{Var}Y)^{1/2}$.
Therefore,

$$\sigma_{CV}^2 = \mathrm{Var}X\left(1 - \mathrm{Corr}^2(X,Y)\right).$$

Hence, as long as $\mathrm{Corr}(X,Y) \ne 0$, some form of variance reduction is achieved.
In practice, quantities like $\sigma_Y^2 = \mathrm{Var}Y$ and $\mathrm{Cov}(X,Y)$ are usually not available;
they have to be estimated from the simulations based on sample values.
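For example, let $\bar{X} = \sum_{i=1}^{n} X_i/n$ and $\bar{Y} = \sum_{i=1}^{n} Y_i/n$. Then

$$\widehat{\mathrm{Cov}}(X,Y) = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}), \qquad \hat{\sigma}_Y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2, \qquad \hat{c}^* = -\frac{\widehat{\mathrm{Cov}}(X,Y)}{\hat{\sigma}_Y^2}.$$

Suppose we use $\bar{X}$ from simulation to estimate $\theta$. Then the control variate
would be $\bar{Y}$ and the control variate estimator is

$$\bar{X} + c^*(\bar{Y} - \mu_Y),$$

with variance equal to

$$\frac{1}{n}\left(\mathrm{Var}X - \frac{\mathrm{Cov}^2(X,Y)}{\mathrm{Var}Y}\right) = \frac{\sigma_{CV}^2}{n}.$$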

Equivalently, one can use the simple linear regression equation

$$X = a + bY + e, \quad e \sim \text{i.i.d. } (0, \sigma^2), \qquad (6.2)$$

to estimate $c^*$. In fact, it can be easily shown that the least squares estimate
of $b$ satisfies $\hat{b} = -\hat{c}^*$; see Weisberg (1985). In such a case, the control variate estimator
is given by

$$\bar{X} + \hat{c}^*(\bar{Y} - \mu_Y) = \bar{X} - \hat{b}(\bar{Y} - \mu_Y) = \hat{a} + \hat{b}\mu_Y, \qquad (6.3)$$

where $\hat{a} = \bar{X} - \hat{b}\bar{Y}$ is the least squares estimate of $a$ in (6.2). That is, the
control variate estimate is equal to the estimated regression equation evaluated
at the point $\mu_Y$.

Notice that there is a very simple geometric interpretation using (6.2).
First, observe that the estimated regression line is

$$\hat{X} = \hat{a} + \hat{b}Y = \bar{X} + \hat{b}(Y - \bar{Y}).$$

Thus, this line passes through the point $(\bar{Y}, \bar{X})$. Second, from (6.3),

$$\hat{X}_{CV} = \hat{a} + \hat{b}\mu_Y = \bar{X} - \hat{b}(\bar{Y} - \mu_Y).$$

Suppose that $\bar{Y} < \mu_Y$, that is, the simulation run underestimates $\mu_Y$, and
suppose that $X$ and $Y$ are positively correlated. Then it is likely that $\bar{X}$
would underestimate $E(X) = \theta$. We therefore need to adjust the estimator
upward, and this is indicated by the fact that $\hat{b} = -\hat{c}^* > 0$. The extra amount
that needs to be adjusted upward equals $-\hat{b}(\bar{Y} - \mu_Y)$, which is governed by
the linear equation (6.3).

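Finally, $\hat{\sigma}^2$, the regression estimate of $\sigma^2$, is the estimate of $\mathrm{Var}(X - \hat{b}Y) = \mathrm{Var}(X + \hat{c}^*Y)$.
To see this, recall from regression that

$$\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^{n}(X_i - \hat{a} - \hat{b}Y_i)^2 = \frac{1}{n-2}\sum_{i=1}^{n}\left((X_i - \bar{X}) - \hat{b}(Y_i - \bar{Y})\right)^2 = \frac{1}{n-2}\sum_{i=1}^{n}\left((X_i - \bar{X})^2 - \hat{b}^2(Y_i - \bar{Y})^2\right)$$

$$\approx \widehat{\mathrm{Var}}(X) - \hat{b}^2\,\widehat{\mathrm{Var}}(Y) = \widehat{\mathrm{Var}}(X - \hat{b}Y).$$

The third equality uses $\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \hat{b}\sum (Y_i - \bar{Y})^2$, and the
last equality follows from a standard expansion of the variance estimate
(see exercise 6.2). It follows that the estimated variance of the control variate
estimator $\bar{X} + \hat{c}^*(\bar{Y} - \mu_Y)$ is $\hat{\sigma}^2/n$.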

Example 6.8 Consider the problem $\theta = E(e^U)$ again.

A natural control variate is $U$ itself. Now

$$\mathrm{Cov}(e^U, U) = E(Ue^U) - E(U)E(e^U) = \int_0^1 xe^x\,dx - (e - 1)/2 = 1 - (e - 1)/2 = 0.14086.$$

The second to last equality makes use of the fact that $E(U) = 1/2$. Recall also from the
previous examples that $\mathrm{Var}\,U = 1/12$ and $\mathrm{Var}(e^U) = 0.242$. It follows that the
control variate estimate has variance

$$\mathrm{Var}(e^U + c^*(U - 1/2)) = \mathrm{Var}(e^U)\left(1 - 12(0.14086)^2/0.242\right) = 0.0039,$$

resulting in a variance reduction of $(0.242 - 0.0039)/0.242 \times 100\% = 98.4\%$. □
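
Example 6.8 can be checked with the Python sketch below. It first evaluates the closed-form covariance, optimal coefficient, and reduced variance, and then estimates $c^*$ from simulated data. The seed and the sample size are arbitrary assumptions. Estimating the coefficient from the same sample that forms the estimator is a simplification that introduces a small bias, which a separate pilot run would avoid.

```python
import math
import random
from statistics import mean, variance

# Closed-form quantities from Example 6.8.
cov_exact = 1.0 - (math.e - 1.0) / 2.0                  # Cov(e^U, U)
var_exp_u = (math.e ** 2 - 1) / 2 - (math.e - 1) ** 2   # Var(e^U)
c_star = -cov_exact / (1.0 / 12.0)                      # -Cov(X, Y) / Var(Y)
var_cv_exact = var_exp_u * (1.0 - 12.0 * cov_exact ** 2 / var_exp_u)

# Simulation with the coefficient estimated from the sample.
rng = random.Random(6)
n = 50_000
us = [rng.random() for _ in range(n)]
xs = [math.exp(u) for u in us]
x_bar, u_bar = mean(xs), mean(us)
cov_hat = sum((x - x_bar) * (u - u_bar) for x, u in zip(xs, us)) / (n - 1)
c_hat = -cov_hat / variance(us)
cv = [x + c_hat * (u - 0.5) for x, u in zip(xs, us)]    # uses the known mean E(U) = 1/2
theta_cv = mean(cv)

print(cov_exact, c_star, var_cv_exact)
print(c_hat, theta_cv, variance(cv))
```

The adjusted values `cv` have the same mean as $e^U$ but a far smaller spread, which is the source of the variance reduction.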
In general, if we want to have more than one control variate, we can make
use of outputs from the multiple linear regression model given by

$$X = a + \sum_{i=1}^{k} b_i Y_i + e, \quad e \sim \text{i.i.d. } (0, \sigma^2).$$

In this case, the least squares estimates of $a$ and the $b_i$s, $\hat{a}$ and the $\hat{b}_i$s, can be easily
shown to satisfy $\hat{c}_i^* = -\hat{b}_i$, $i = 1, \ldots, k$. Furthermore, the control variate
estimate is given by

$$\bar{X} + \sum_{i=1}^{k}\hat{c}_i^*(\bar{Y}_i - \mu_i) = \hat{a} + \sum_{i=1}^{k}\hat{b}_i\mu_i,$$

where $E(Y_i) = \mu_i$, $i = 1, \ldots, k$. In other words, the control variate estimate
is equal to the estimated multiple regression line evaluated at the point
$(\mu_1, \ldots, \mu_k)$. By the same token, the variance of the control variate estimate
is given by $\hat{\sigma}^2/n$, where $\hat{\sigma}^2$ is the regression estimate of $\sigma^2$.


Example 6.9 Continuing along the same line, consider simulating the vanilla
European call option as in Example 6.6, using the terminal value $S_T$ as the
control variate.


The control variate estimator is given by

$$C_{CV} = C + c^*(S_T - E(S_T)).$$

Recall that $S_T = S_0 e^{\nu T + \sigma\sqrt{T}Z}$; it can be easily deduced that

$$E(S_T) = S_0 e^{rT}, \qquad (6.4)$$

$$\mathrm{Var}(S_T) = S_0^2 e^{2rT}(e^{\sigma^2 T} - 1). \qquad (6.5)$$

The algorithm goes as follows:

1. For $i = 1, \ldots, N_1$, simulate a pilot of $N_1$ independent paths to get

$$S_T(i) = S_0 e^{\nu T + \sigma\sqrt{T}Z_i}, \qquad C(i) = e^{-rT}\max\{0, S_T(i) - K\}.$$

2. Compute $E(S_T)$ as $S_0 e^{rT}$ or estimate it by $\sum_{i=1}^{N_1} S_T(i)/N_1$. Compute
$\mathrm{Var}(S_T)$ as $S_0^2 e^{2rT}(e^{\sigma^2 T} - 1)$ or estimate it by $\frac{1}{N_1 - 1}\sum_{i=1}^{N_1}(S_T(i) - \bar{S}_T)^2$.
Now estimate the covariance by

$$\widehat{\mathrm{Cov}}(S_T, C) = \frac{1}{N_1 - 1}\sum_{i=1}^{N_1}(S_T(i) - \bar{S}_T)(C(i) - \bar{C}),$$

where $\bar{C} = \sum_{i=1}^{N_1} C(i)/N_1$ and $\bar{S}_T = \sum_{i=1}^{N_1} S_T(i)/N_1$.

3. Repeat the simulations of $S_T$ and $C$ by means of the control variate. For
$i = 1, \ldots, N_2$, independently simulate

$$S_T(i) = S_0 e^{\nu T + \sigma\sqrt{T}Z_i},$$

$$C(i) = e^{-rT}\max\{0, S_T(i) - K\},$$

$$C_{CV}(i) = C(i) + c^*(S_T(i) - E(S_T)),$$

where $c^* = -\widehat{\mathrm{Cov}}(S_T, C)/\mathrm{Var}\,S_T$ is computed from the preceding step.


4. Calculate the control variate estimator by

$$\bar{C}_{CV} = \frac{1}{N_2}\sum_{i=1}^{N_2} C_{CV}(i).$$

Complete the simulation by evaluating the standard error of $\bar{C}_{CV}$ and
constructing confidence intervals.


Here is the SPLUS code and output. □


N1 <- 500
N2 <- 50000
S0 <- 10
K <- 12
r <- 0.03
sigma <- 0.4
nu <- r-sigma^2/2
t <- 1
ST <- rep(0,N1)
C <- rep(0,N1)
ST2 <- rep(0,N2)
C2 <- rep(0,N2)
CCV <- rep(0,N2)

# Pilot run of N1 paths to estimate the coefficient c*
for (i in 1:N1)
{
  z <- rnorm(1)
  ST[i] <- S0*exp(nu*t+sigma*sqrt(t)*z)
  C[i] <- exp(-r*t)*max(ST[i]-K,0)
}
ST.bar <- S0*exp(r*t)
VarST.hat <- S0^2*exp(2*r*t)*(exp(sigma^2*t)-1)
#ST.bar <- mean(ST)
C.bar <- mean(C)
Cov.hat <- sum((ST-ST.bar)*(C-C.bar))/(N1-1)
#VarST.hat <- sum((ST-ST.bar)^2)/(N1-1)
c <- -Cov.hat/VarST.hat

# Main run of N2 paths using the control variate
for (i in 1:N2)
{
  z <- rnorm(1)
  ST2[i] <- S0*exp(nu*t+sigma*sqrt(t)*z)
  C2[i] <- exp(-r*t)*max(ST2[i]-K,0)
  CCV[i] <- C2[i]+c*(ST2[i]-ST.bar)
}
CCV.bar <- mean(CCV)
Var.CCV <- sum((CCV-CCV.bar)^2)/(N2-1)
# SE holds the sample standard deviation; the standard error is SE/sqrt(N2)
SE <- sqrt(Var.CCV)
CI <- CCV.bar-1.96*SE/sqrt(N2)
CI[2] <- CCV.bar+1.96*SE/sqrt(N2)
CCV.bar
CI


For \(N_1 = 500\) and \(N_2 = 50{,}000\), we have a 95% confidence interval for
\(C_{CV}\) of [1.0023, 1.0247]. In this case, the estimated call price is 1.0135 with
standard error 0.0057.
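For readers without S-PLUS, the four steps of Example 6.9 can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `bs_call` and `cv_call`, the seed, and the use of the Black-Scholes formula as a reference value are not part of the original listing.

```python
import math
import random


def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call, used here only as a reference."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    phi = lambda x: 0.5 * math.erfc(-x / math.sqrt(2.0))  # standard normal c.d.f.
    return s0 * phi(d1) - k * math.exp(-r * t) * phi(d2)


def cv_call(s0=10.0, k=12.0, r=0.03, sigma=0.4, t=1.0,
            n1=500, n2=50_000, seed=7):
    """Control variate estimate of the call price with S_T as the control."""
    rng = random.Random(seed)
    nu = r - 0.5 * sigma ** 2

    def draw():
        st = s0 * math.exp(nu * t + sigma * math.sqrt(t) * rng.gauss(0.0, 1.0))
        return st, math.exp(-r * t) * max(st - k, 0.0)

    mean_st = s0 * math.exp(r * t)                                          # (6.4)
    var_st = s0 ** 2 * math.exp(2 * r * t) * (math.exp(sigma ** 2 * t) - 1)  # (6.5)

    # Steps 1-2: pilot run to estimate Cov(S_T, C) and hence c*.
    pilot = [draw() for _ in range(n1)]
    c_bar = sum(c for _, c in pilot) / n1
    cov = sum((s - mean_st) * (c - c_bar) for s, c in pilot) / (n1 - 1)
    c_star = -cov / var_st

    # Steps 3-4: independent main run with the control variate applied.
    values = []
    for _ in range(n2):
        st, c = draw()
        values.append(c + c_star * (st - mean_st))
    estimate = sum(values) / n2
    variance = sum((v - estimate) ** 2 for v in values) / (n2 - 1)
    return estimate, math.sqrt(variance / n2)


if __name__ == "__main__":
    price, se = cv_call()
    print("control variate price:", price, "standard error:", se)
    print("Black-Scholes reference:", bs_call(10.0, 12.0, 0.03, 0.4, 1.0))
```

Because c* is estimated from a separate pilot run, the main-run estimator stays unbiased, which is the separation of coefficient estimation from price estimation that the text describes.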

In using control variates, there are a number of features that should be 
kept in mind. 


1. What should constitute the appropriate control? We have seen that in
simple cases, the underlying asset prices may be appropriate. In more
complicated situations, we may use some easily computed quantities that
are highly correlated with the object of interest as control variates. For
example, standard calls and puts frequently provide a convenient source
of control variates for pricing exotic options, and so does the underlying
asset itself.

2. The control variate estimator is usually unbiased by construction. Also,
we can separate the estimation of the coefficients (the \(c_i^*\)'s) from the
estimation of prices.

3. The flexibility of choosing the \(c_i\)'s suggests that we can sometimes make
optimal use of information. In any event, we should exploit the specific
features of the problem under consideration, rather than rely on generic
applications of routine methods.

4. Because of their close relationship with linear regression, control variates
are easily computed and explained.

5. We have only covered linear control. In practice, one can consider using
nonlinear control variates, for example, \(XY/\mu_Y\). Statistical inference
for nonlinear control may be tricky though.


6.5 IMPORTANCE SAMPLING


After studying three variance reduction methods, we will pursue one last 
method, namely, importance sampling. This method is similar in idea to 
the acceptance-rejection method that was discussed in Chapter 4. Its main 
idea lies in concentrating the sampling at places where the quantity of interest
carries the most information, hence the name importance sampling. This chapter
then concludes with examples illustrating the different methods of variance
reduction in risk management.
Suppose that we are interested in estimating 


\[ \theta = E[h(X)] = \int h(x)f(x)\,dx, \]

where \(X = (X_1, \ldots, X_n)\) denotes an n-dimensional random vector having a
joint p.d.f. \(f(x) = f(x_1, \ldots, x_n)\). Suppose that a direct simulation of the
random vector \(X\) is inefficient so that computing \(h(X)\) is infeasible. This
inefficiency may be due to difficulties encountered in simulating \(X\), or the
variance of \(h(X)\) being too large, or a combination of both.

Suppose there exists another density \(g(x)\), which is easy to simulate and
satisfies the condition that \(f(x) = 0\) whenever \(g(x) = 0\). Then \(\theta\) can be
written as

\[ \theta = \int \frac{h(x)f(x)}{g(x)}\,g(x)\,dx = E_g\left[\frac{h(X)f(X)}{g(X)}\right], \]

where the notation \(E_g\) denotes the expectation of the random vector \(X\) taken
under the density \(g\), i.e., \(X\) has joint p.d.f. \(g(x)\). It follows from this identity
that \(\theta\) can be estimated by generating \(X\) with density \(g\) and then using as the
estimator the average of the values of \(h(X)f(X)/g(X)\). In other words, we
could construct a Monte Carlo estimator of \(\theta = E(h(X))\) by first generating
i.i.d. random vectors \(X_i\) with p.d.f. \(g(x)\), then using the estimator

\[ \hat{\theta} = \frac{1}{n}\sum_{i=1}^{n}\frac{h(X_i)f(X_i)}{g(X_i)}. \]

If a density \(g(x)\) can be chosen so that the random variable \(h(X)f(X)/g(X)\)
has a small variance, then this approach is known as the importance sampling
approach and can result in an efficient estimator of \(\theta\).

To see how it works, note that the ratio f(X)/g(X) represents the like- 
lihood ratio of obtaining X with respective densities f and g. If X is dis- 
tributed according to g, then f(X) would be small relative to g(X) and there- 
fore when X is simulated according to g, the likelihood ratio f(X)/g(X) will 
usually be small in comparison to 1. On the other hand, it can be seen that 


\[ E_g\left[\frac{f(X)}{g(X)}\right] = \int \frac{f(x)}{g(x)}\,g(x)\,dx = \int f(x)\,dx = 1. \]

Thus, even though the likelihood ratio \(f(X)/g(X)\) is usually smaller than 1, its mean
is equal to 1, suggesting that it occasionally takes large values and results in
a large variance.
To make the variance of \(h(X)f(X)/g(X)\) small, we arrange for a density
\(g\) such that those values of \(X\) for which \(f(X)/g(X)\) is large are precisely the
values for which \(h(X)\) is small, thus making the ratio \(h(X)f(X)/g(X)\) stay
small. Since importance sampling requires \(h\) to be small sometimes, it works
best when estimating a small probability. Further discussions on importance
sampling and likelihood methods are given in Glasserman (2003).


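Example 6.10 Consider the problem \(\theta = E(U^5)\).

Suppose that we use the standard method \(\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} U_i^5\); then we oversample
the data near the origin and undersample the data near 1. It is easy to
compute that

\[ \mathrm{Var}(\hat{\theta}) = \frac{1}{n}\{EU^{10} - (EU^5)^2\} = \frac{1}{n}\left(\frac{1}{11} - \frac{1}{36}\right) = \frac{0.0631}{n}. \]

Now, suppose we use importance sampling, putting more weight near 1.
Let \(g(x) = 5x^4\) for \(0 < x < 1\). Then

\[ \theta = E_g\left(\frac{X^5}{5X^4}\right) = \frac{E_g X}{5}, \qquad \hat{\theta}_1 = \frac{1}{5n}\sum_{i=1}^{n} X_i. \]

The variance of this method is

\[
\begin{aligned}
\mathrm{Var}(\hat{\theta}_1) &= \frac{1}{25n}\{E_g X^2 - (E_g X)^2\} \\
&= \frac{1}{25n}\left\{\int_0^1 5x^6\,dx - \left(\int_0^1 5x^5\,dx\right)^2\right\} \\
&= \frac{1}{25n}\left(\frac{5}{7} - \frac{25}{36}\right) = \frac{0.000794}{n},
\end{aligned}
\]

resulting in a variance reduction of 98.74%. □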
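A short Python sketch can confirm the comparison in Example 6.10. It samples from g(x) = 5x^4 by inverse transform (X = U^(1/5), an implementation choice not stated in the text); the function names, sample size, and seed are illustrative assumptions.

```python
import random


def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def compare_u5(n=50_000, seed=3):
    """Return (mean, variance) for the standard and the importance sampling
    estimators of theta = E(U^5) = 1/6, computed per simulated value."""
    rng = random.Random(seed)
    # Standard method: average U^5 with U ~ U(0,1).
    plain = [rng.random() ** 5 for _ in range(n)]
    # Importance sampling: G(x) = x^5 on (0,1), so X = U^(1/5) has density
    # g(x) = 5x^4, and the weighted value is X^5 / (5 X^4) = X / 5.
    weighted = [rng.random() ** 0.2 / 5.0 for _ in range(n)]
    return (sum(plain) / n, sample_variance(plain),
            sum(weighted) / n, sample_variance(weighted))


if __name__ == "__main__":
    mean_plain, var_plain, mean_is, var_is = compare_u5()
    print("standard estimate:", mean_plain, "per-sample variance:", var_plain)
    print("importance sampling estimate:", mean_is, "per-sample variance:", var_is)
    print("variance reduction:", 1.0 - var_is / var_plain)
```

The per-sample variances should be near 0.0631 and 0.000794, in line with the derivation above.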
How do we choose \(g\) in general? This requires the notion of the so-called
tilted density. Recall the notation that \(M(t) = E(e^{tX})\) represents the moment
generating function (m.g.f.) of the random variable \(X\) with density \(f\).

Definition 6.2 A density function

\[ f_t(x) = \frac{e^{tx}f(x)}{M(t)} \]

is called a tilted density of a given \(f\), \(-\infty < t < \infty\).

Note that from this definition, a random variable with density \(f_t\) tends to be
larger than the one with density \(f\) when \(t > 0\), and tends to be smaller when
\(t < 0\).
Example 6.11 Let \(f\) be a Bernoulli density with parameter \(p\). Then
\(f(x) = p^x(1-p)^{1-x}\), \(x = 0, 1\). In this case, the m.g.f. is
\(M(t) = E(e^{tX}) = pe^t + (1-p)\) so that

\[
\begin{aligned}
f_t(x) &= \frac{e^{tx}f(x)}{M(t)} \\
&= \frac{1}{M(t)}(pe^t)^x(1-p)^{1-x} \\
&= \left(\frac{pe^t}{pe^t + 1 - p}\right)^x\left(\frac{1-p}{pe^t + 1 - p}\right)^{1-x}.
\end{aligned}
\]

Thus, the tilted density \(f_t\) is a Bernoulli density with parameter
\(p_t = pe^t/(pe^t + 1 - p)\). □


In many instances, we are interested in sums of independent random variables.
In these cases, the joint density \(f(x)\) of \(x = (x_1, \ldots, x_n)\) can be written
as the product of the marginals \(f_i\) of \(x_i\) so that

\[ f(x) = f_1(x_1) \cdots f_n(x_n). \]

In this situation, it is often useful to generate the \(X_i\) according to their tilted
densities with a common \(t\).


Example 6.12 Let \(X_1, \ldots, X_n\) be independent with marginal densities \(f_i\).
Suppose we are interested in estimating the quantity

\[ \theta = P(S \ge a), \]

where \(S = \sum_{i=1}^{n} X_i\) and \(a > \sum_{i=1}^{n} E(X_i)\) is a given constant. We can apply
tilted densities to estimate \(\theta\). Let \(I\{S \ge a\}\) equal 1 if \(S \ge a\) and 0 otherwise.
Then

\[ \theta = E(I\{S \ge a\}), \]

where the expectation is taken with respect to the joint density. Suppose we
simulate \(X_i\) according to the tilted density function \(f_{t,i}\), where the value of
\(t > 0\) is to be specified. To construct the importance sampling estimator, note
that \(h(X) = I\{S \ge a\}\), \(f(X) = \prod f_i(X_i)\), and \(g(X) = \prod f_{t,i}(X_i)\). The
importance sampling estimator would be

\[ \hat{\theta} = I\{S \ge a\}\prod_{i=1}^{n}\frac{f_i(X_i)}{f_{t,i}(X_i)}. \]

Now \(f_i(X_i)/f_{t,i}(X_i) = M_i(t)e^{-tX_i}\), therefore,

\[ \hat{\theta} = I\{S \ge a\}\prod_{i=1}^{n} M_i(t)e^{-tX_i} = I\{S \ge a\}M(t)e^{-tS}, \qquad M(t) = \prod_{i=1}^{n} M_i(t). \]

Since it is assumed that \(t > 0\), \(S \ge a\) iff \(e^{-tS} \le e^{-ta}\) and

\[ I\{S \ge a\}e^{-tS} \le e^{-ta} \]

so that

\[ \hat{\theta} \le M(t)e^{-ta}. \]

We now find \(t > 0\) such that the right-hand side of the above inequality is
minimized. In that case, we obtain an estimator that lies between 0 and
\(\min_t M(t)e^{-ta}\). It can be shown that such \(t\) can be found by solving the equation

\[ E_t(S) = a. \]

After solving for \(t\), it can be utilized in the simulation. To be specific, suppose
\(X_1, \ldots, X_n\) are i.i.d. Bernoulli trials with \(p = p_i = 0.4\). Let \(n = 20\) and
\(a = 16\). Then

\[ \hat{\theta} = I\{S \ge a\}e^{-tS}\prod_{i=1}^{20}(pe^t + 1 - p). \]

Recall from the preceding example that the tilted density \(f_{t,i}\) is the p.d.f. of a
Bernoulli trial with parameter \(p^* = pe^t/(pe^t + 1 - p)\). It follows that

\[ E_t(S) = 20p^* = \sum_{i=1}^{20}\frac{pe^t}{pe^t + 1 - p}. \]

Plugging in \(n = 20\), \(p = 0.4\), \(a = 16\), we have

\[ 20\,\frac{0.4e^t}{0.4e^t + 0.6} = 16, \]

which leads to \(e^{t^*} = 6\). Therefore, we should generate Bernoulli trials with
parameter \(0.4e^{t^*}/(0.4e^{t^*} + 0.6) = 0.8\) as the \(g\) and evaluate
\(M(t^*) = (0.4e^{t^*} + 0.6)^{20} = 3^{20}\) and \(e^{-t^*S} = (1/6)^S\). The importance
sampling estimator is now

\[ \hat{\theta} = I\{S \ge 16\}M(t^*)e^{-t^*S} = I\{S \ge 16\}\,3^{20}(1/6)^S. \]

Furthermore, we know that

\[ \hat{\theta} \le M(t^*)e^{-t^*a} = 3^{20}(1/6)^{16} = 0.001236. \]

Thus, in each iteration, the value of the importance sampling estimator lies
between 0 and 0.001236.

On the other hand, we can also evaluate \(\theta = P(S \ge 16)\) exactly, which
equals the probability that a Binomial random variable with parameters 20
and 0.4 is at least as big as 16. This value turns out to be 0.000317. Recall
the function \(h(X) = I\{S \ge 16\}\). This is a Bernoulli trial with parameter
\(\theta = 0.000317\). Therefore, if we simulate directly from the \(X_i\)'s, the standard
estimator \(\hat{\theta}_S\) has variance

\[ \mathrm{Var}(\hat{\theta}_S) = \theta(1 - \theta) = 3.169 \times 10^{-4}. \]

As \(0 \le \hat{\theta} \le 0.001236\), it can be shown that

\[ \mathrm{Var}(\hat{\theta}) \le (0.001236)^2/4 = 3.819 \times 10^{-7}, \]

which is much smaller than the variance of the standard estimator \(\hat{\theta}_S\). □
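The tilted-Bernoulli estimator of Example 6.12 can be sketched in Python as below. The function name `tilted_binomial_tail`, the number of replications, and the seed are assumptions made for illustration; the tilt is obtained by solving \(E_t(S) = a\) in closed form, which for Bernoulli trials gives \(p^* = a/n\).

```python
import math
import random


def tilted_binomial_tail(n=20, p=0.4, a=16, n_sim=50_000, seed=11):
    """Importance sampling estimate of P(S >= a) for S ~ Binomial(n, p),
    simulating the Bernoulli trials under the tilted parameter p*."""
    rng = random.Random(seed)
    p_star = a / n                                    # E_t(S) = n p* = a
    e_t = p_star * (1.0 - p) / (p * (1.0 - p_star))   # solves p e^t/(p e^t + 1 - p) = p*
    m_t = (p * e_t + 1.0 - p) ** n                    # M(t*) = (p e^t + 1 - p)^n
    total = 0.0
    largest = 0.0
    for _ in range(n_sim):
        s = sum(rng.random() < p_star for _ in range(n))   # S under the tilted density
        weight = m_t * e_t ** (-s) if s >= a else 0.0      # I{S >= a} M(t*) e^{-t* S}
        total += weight
        largest = max(largest, weight)
    return total / n_sim, largest


def exact_binomial_tail(n=20, p=0.4, a=16):
    """Exact P(S >= a) for comparison."""
    return sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
               for k in range(a, n + 1))


if __name__ == "__main__":
    estimate, largest = tilted_binomial_tail()
    print("importance sampling estimate:", estimate)
    print("largest single value:", largest)
    print("exact probability:", exact_binomial_tail())
```

No simulated value can exceed the bound \(3^{20}(1/6)^{16}\) derived in the example, which is what keeps the variance small.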


Another application of importance sampling is to estimate tail probabilities
(recall that importance sampling works best for small probabilities).
Suppose we are interested in estimating \(P(X > a)\), where
\(X\) has p.d.f. \(f\) and \(a\) is a given constant. Let \(I(X > a) = 1\) if \(X > a\) and 0
otherwise. Then

\[
\begin{aligned}
P(X > a) &= E_f(I(X > a)) \\
&= E_g\left[I(X > a)\frac{f(X)}{g(X)}\right] \\
&= E_g\left[I(X > a)\frac{f(X)}{g(X)} \,\Big|\, X > a\right]P_g(X > a)
 + E_g\left[I(X > a)\frac{f(X)}{g(X)} \,\Big|\, X \le a\right]P_g(X \le a) \\
&= E_g\left[\frac{f(X)}{g(X)} \,\Big|\, X > a\right]P_g(X > a),
\end{aligned}
\]

where \(P_g\) denotes probability under the density \(g\).

Take \(g(x) = \lambda e^{-\lambda x}\), \(x > 0\), an exponential density with parameter \(\lambda\). Then
the above derivation shows

\[ P(X > a) = E_g[e^{\lambda X}f(X) \mid X > a]\,e^{-\lambda a}/\lambda. \]

Using the so-called “memoryless property”, i.e., \(P(X > s+t \mid X > s) = P(X > t)\),
of an exponential distribution, it can be easily seen that the conditional
distribution of an exponential distribution conditioned on \(\{X > a\}\) has the
same distribution as \(a + X\). Therefore,

\[
P(X > a) = \frac{e^{-\lambda a}}{\lambda}E_g[e^{\lambda(X+a)}f(X+a)] = \frac{1}{\lambda}E_g[e^{\lambda X}f(X+a)].
\]

We can now estimate \(\theta\) by generating \(X_1, \ldots, X_n\) according to an exponential
distribution with parameter \(\lambda\) and using

\[ \hat{\theta} = \frac{1}{\lambda}\,\frac{1}{n}\sum_{i=1}^{n} e^{\lambda X_i}f(X_i + a). \]

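Example 6.13 Suppose we are interested in \(\theta = P(X > a)\), where \(X\) is
standard normal. Then \(f\) is the normal density. Let \(g\) be an exponential
density with \(\lambda = a\). Then

\[
\begin{aligned}
P(X > a) &= \frac{1}{a}E_g[e^{aX}f(X + a)] \\
&= \frac{1}{a\sqrt{2\pi}}E_g[e^{aX - (X+a)^2/2}] \\
&= \frac{e^{-a^2/2}}{a\sqrt{2\pi}}E_g[e^{-X^2/2}].
\end{aligned}
\]

We can therefore estimate \(\theta\) by generating \(X_1, \ldots, X_n\) from an exponential
distribution with rate \(a\), and then using

\[ \hat{\theta} = \frac{e^{-a^2/2}}{a\sqrt{2\pi}}\,\frac{1}{n}\sum_{i=1}^{n} e^{-X_i^2/2} \]

to estimate \(\theta\). To compute the variance of \(\hat{\theta}\), we need to compute the quantities
\(E_g[e^{-X^2/2}]\) and \(E_g[e^{-X^2}]\). These can be computed numerically and can be
shown to be

\[
E_g[e^{-X^2/2}] = ae^{a^2/2}\sqrt{2\pi}\,(1 - \Phi(a)), \qquad
E_g[e^{-X^2}] = ae^{a^2/4}\sqrt{\pi}\,(1 - \Phi(a/\sqrt{2})).
\]

For example, if \(a = 3\) and \(n = 1\), then \(\mathrm{Var}(e^{-X^2/2}) = 0.0201\) and
\(\mathrm{Var}(\hat{\theta}) = \left(\frac{e^{-9/2}}{3\sqrt{2\pi}}\right)^2 \times 0.0201 \approx 4.38 \times 10^{-8}\).
On the other hand, a standard estimator has
variance \(\theta(1 - \theta) = 0.00134\). □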
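As a hedged illustration of Example 6.13, the Python sketch below estimates \(P(X > 3)\) for a standard normal \(X\) by sampling from an exponential distribution with rate \(a\). The function names, sample size, and seed are assumptions; `random.expovariate` takes the rate as its argument, and the exact value is obtained from `math.erfc` for comparison.

```python
import math
import random


def normal_tail_is(a=3.0, n=100_000, seed=5):
    """Importance sampling estimate of P(X > a), X standard normal,
    using an exponential proposal with rate a."""
    rng = random.Random(seed)
    const = math.exp(-a * a / 2.0) / (a * math.sqrt(2.0 * math.pi))
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(a)            # X ~ exponential with rate a
        total += math.exp(-x * x / 2.0)   # e^{-X^2/2}
    return const * total / n


def normal_tail_exact(a=3.0):
    """Exact tail probability 1 - Phi(a)."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))


if __name__ == "__main__":
    print("importance sampling estimate:", normal_tail_is())
    print("exact value:", normal_tail_exact())
```

A direct simulation would need a very large sample to see even a handful of exceedances of 3, whereas every draw here contributes a nonzero term.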


Consider simulating a vanilla European call option price again, using the
importance sampling technique. Suppose that we evaluate the value of a
deep out-of-money (\(S_0 \ll K\)) European call option with a short maturity
\(T\). Many sample paths result in \(S_T \le K\) and give zero values. Thus, these
samples are wasted. One possible way to deal with this problem is to increase
the values of the \(Z_i\)'s by sampling them from a distribution with large mean and
large variance. Sample \(\tilde{Z}_i\) from \(N(m/(\sigma\sqrt{T}), s^2)\) so that

\[ \sigma\sqrt{T}\tilde{Z}_i \sim N(m, s^2\sigma^2 T). \]

Note that \(\tilde{Z}_i\) can be written as

\[ \tilde{Z}_i = \frac{m}{\sigma\sqrt{T}} + sZ_i, \qquad Z_i \sim N(0, 1). \]

The importance sampling estimator is then given by

\[ C_I = \frac{1}{N}e^{-rT}\sum_{i=1}^{N}\max\{S_0e^{(r - \sigma^2/2)T + \sigma\sqrt{T}\tilde{Z}_i} - K, 0\}\,R(\tilde{Z}_i), \]

where

\[ R(\tilde{Z}_i) = \frac{\frac{1}{\sqrt{2\pi}}\exp(-\tilde{Z}_i^2/2)}{\frac{1}{\sqrt{2\pi}\,s}\exp\left(-\frac{1}{2s^2}\left(\tilde{Z}_i - \frac{m}{\sigma\sqrt{T}}\right)^2\right)}. \]

Thus, \(C_I\) can be expressed as

\[ C_I = se^{-rT}\frac{1}{N}\sum_{i=1}^{N}\max\{S_0e^{(r - \sigma^2/2)T + m + s\sigma\sqrt{T}Z_i} - K, 0\}\exp\left(\frac{Z_i^2}{2} - \frac{1}{2}\left(\frac{m}{\sigma\sqrt{T}} + sZ_i\right)^2\right). \]

Example 6.14 Let \(S_0 = 100\), \(K = 140\), \(r = 0.05\), \(\sigma = 0.3\), and \(T = 1\).
We simulate the value of this deep out-of-money European call option using
the importance sampling technique and compare it with the result of the standard
method.


The SPLUS code is as follows: 


######## Part (1): Standard method ########
N <- 10000
S0 <- 100
K <- 140
t <- 1
r <- 0.05
sigma <- 0.3
nu <- r - sigma^2/2

Ci <- rep(0,N)
for (i in 1:N)
{
  z <- rnorm(1)
  ST <- S0*exp(nu*t+sigma*sqrt(t)*z)
  Ci[i] <- exp(-r*t)*max(ST-K,0)
}
C.bar <- mean(Ci)
SE <- sqrt(var(Ci)/(N-1))
C.bar
SE


######## Part (2): Importance Sampling ########
N <- 10000
S0 <- 100
K <- 140
t <- 1
r <- 0.05
sigma <- 0.3
nu <- r - sigma^2/2
m <- 0.5
s <- 1.1

Ci <- rep(0,N)
for (i in 1:N)
{
  z1 <- rnorm(1)
  # z2 is the shifted and scaled normal draw; z1 is the underlying N(0,1)
  z2 <- m/(sigma*sqrt(t))+s*z1
  ST <- S0*exp(nu*t+m+s*sigma*sqrt(t)*z1)
  Ci[i] <- s*exp(-r*t)*max(ST-K,0)*exp(z1^2/2-z2^2/2)
}
C.bar <- mean(Ci)
SE <- sqrt(var(Ci)/(N-1))
C.bar
SE

For N = 10,000, we have \(C_I = 3.1202\) with standard error 0.0264 using
importance sampling while getting \(\bar{C} = 3.0166\) with standard error 0.1090
using the standard method. The result shows that the importance sampling
technique gives a more precise estimate of the price of the option, which has a
theoretical Black-Scholes price of 3.1187. □
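The comparison in Example 6.14 can also be sketched in Python. This is an illustrative translation rather than the original listing: the names `bs_call` and `is_call`, the sample size, the seed, and the reuse of the same normal draws for both estimators are assumptions.

```python
import math
import random


def bs_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call, used as a reference value."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    phi = lambda x: 0.5 * math.erfc(-x / math.sqrt(2.0))
    return s0 * phi(d1) - k * math.exp(-r * t) * phi(d2)


def mean_and_se(values):
    """Sample mean and its standard error."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


def is_call(s0=100.0, k=140.0, r=0.05, sigma=0.3, t=1.0,
            m=0.5, s=1.1, n=20_000, seed=13):
    """Return ((price, se) standard, (price, se) importance sampling)."""
    rng = random.Random(seed)
    nu = r - 0.5 * sigma ** 2
    vol = sigma * math.sqrt(t)
    disc = math.exp(-r * t)
    plain, weighted = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        # Standard method: Z ~ N(0, 1).
        plain.append(disc * max(s0 * math.exp(nu * t + vol * z) - k, 0.0))
        # Importance sampling: Z~ = m/(sigma sqrt(T)) + s Z, weighted by the
        # likelihood ratio R = s * exp(Z^2/2 - Z~^2/2).
        z_tilde = m / vol + s * z
        ratio = s * math.exp(0.5 * z * z - 0.5 * z_tilde * z_tilde)
        payoff = max(s0 * math.exp(nu * t + vol * z_tilde) - k, 0.0)
        weighted.append(disc * payoff * ratio)
    return mean_and_se(plain), mean_and_se(weighted)


if __name__ == "__main__":
    standard, importance = is_call()
    print("standard method (price, se):", standard)
    print("importance sampling (price, se):", importance)
    print("Black-Scholes reference:", bs_call(100.0, 140.0, 0.05, 0.3, 1.0))
```

The choice m = 0.5 and s = 1.1 follows the example; other shifts would also give an unbiased estimator, only with a different standard error.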
6.6 EXERCISES 
1. Let U ~ U(0,1) and let a and b be two given constants with a < b. 


Show that Y = a+(b—a)U is distributed as a U(a, b) random variable. 


2. Let \(\hat{b}\) be the least squares estimate of \(b\) in the simple linear regression
model \(X = a + bY + e\), \(e \sim (0, \sigma^2)\) i.i.d. Show that

\[ \mathrm{Var}(X - \hat{b}Y) = \mathrm{Var}(X) - \hat{b}^2\,\mathrm{Var}(Y). \]

3. Suppose you want to estimate \(\theta = \int_0^1 e^{x^2}\,dx\). Show that generating
a random number \(U\) and then using the antithetic estimator
\(e^{U^2}(1 + e^{1-2U})/2\) is better than generating two random numbers \(U_1\) and \(U_2\)
and using the standard estimator \((e^{U_1^2} + e^{U_2^2})/2\).


4. Consider estimating \(\theta = \int_0^1 4x^3\,dx\).


(a) Using standard simulation technique, estimate 6. 


(b) Using antithetic variable technique, construct an improved esti- 
mate of 0. 


(c) Using stratification, construct another estimate of 0. 
(d) Construct a control variate estimate of 6. 
(e) Compare the performance of these different estimates. 


(f) Can you combine the above methods to improve the result? 


5. Consider \(\theta = \int_2^\infty (x - 2)e^{-x}\,dx\).

(a) It is known that \(\theta = E[f(X)]\) where \(X \sim \mathrm{Exp}(1)\). What is \(f(X)\)?

(b) Provide an algorithm to sample \(X\) from the interval \([2, \infty)\).

(c) Provide an algorithm to stratify \(X\) in the interval \([2, \infty)\) with equal
probability 1/4 for each stratified interval.

(d) Provide a Monte Carlo algorithm using \((X - 2)\) as the control
variate.

6. Redo Examples 6.6, 6.7, and 6.9 using \(S_0 = K = 100\), \(r = 0.05\), \(\sigma = 0.1\),
and \(T = 1\). Calculate the theoretical Black-Scholes price also.


7. Verify equations (6.4) and (6.5). 


8. Consider a truncated payoff vanilla call option with maturity T and 
strike price K. The payoff function is given by 


\[
h(S_T) = \begin{cases} S_T - K & \text{if } K < S_T < S_b, \\ 0 & \text{otherwise.} \end{cases}
\]

The given constant \(S_b\) acts as a barrier, canceling the option whenever
\(S_T > S_b\). Assume that the stock price follows a geometric Brownian
motion with \(\nu = r - \sigma^2/2\), where the risk free rate \(r\) and the volatility
\(\sigma\) are known. Using the idea of antithetic variables, write a variance
reduction algorithm to estimate the expected payoff.
Path-Dependent Options


7.1. INTRODUCTION 


Contingent claims other than standard call and put options are known as 
exotic options. The most common type of exotic options is path-dependent 
options. As indicated by the name, the payoff of a path-dependent option 
depends on the entire path of the underlying asset prices, not just the ter- 
minal asset price alone. According to this definition, American options are 
path-dependent options because the option holder has to determine whether 
the options are worth exercising at each time point. The path-dependent fea- 
ture of an option usually complicates the analytical tractability of valuation. 
Simulation would be the most useful alternative. 

Owing to the need to value exotic options, this chapter studies simulation 
techniques for European- and American-style path-dependent options. Some 
of the options considered in this chapter have no analytical solutions. 


7.2. BARRIER OPTION 


Barrier options have become increasingly popular. A barrier option
is very much like a “vanilla” option, except that it comes alive or is killed
when the barrier is crossed. Let \(K\) be the strike price, \(T\) be the time to
maturity and \(V\) be the value of the barrier. A down-and-in barrier option
becomes alive only if the stock price (usually counting only closing prices)
goes below \(V\) before \(T\). A down-and-out barrier option is killed if the stock
price goes below \(V\) before \(T\). A down-and-in barrier call option is a cheaper
tool to hedge against the upside risk. From the definition, it can be easily
seen that holding both a down-and-in and a down-and-out option with the same
strike price \(K\) and maturity \(T\) is the same as holding a “vanilla” option.
Let \(C_{di}\) and \(C_{do}\) be the option values of the down-and-in call and the
down-and-out call, respectively. Then

\[ C_{di} + C_{do} = C, \]

where \(C\) is the vanilla call price. Let

\[
S_{\min} = \min_{0 \le t \le T} S(t) \quad\text{and}\quad
I\{S_{\min} < V\} = \begin{cases} 1, & S_{\min} < V, \\ 0, & S_{\min} \ge V, \end{cases}
\]

be the realized minimum asset price and the indicator of the down-and-in
option, respectively. Then, the value of the option can be written as

\[ C_{di} = e^{-rT}E\{I\{S_{\min} < V\}(S(T) - K)^+\}, \]


where E denotes the risk-neutral expectation. The other types of barrier 
options can be evaluated analogously. 

To simulate the value of a down-and-in call option, the algorithm goes as 
follows: 


1. Generate the daily stock prices \(S(t_1), S(t_2), \ldots, S(t_n = T)\). If
\(\min_i S(t_i) < V\), then set
\[ C = e^{-rT}\max(S(T) - K, 0), \]
else set \(C = 0\).

2. Repeat step 1 \(N\) times to obtain \(C_1, \ldots, C_N\). The value of the
down-and-in call option is given by
\[ \bar{C} = \frac{1}{N}\sum_{j=1}^{N} C_j. \]

Example 7.1 Let \(S_0 = 10\), \(r = 0.03\), \(\sigma = 0.4\), and \(dt = 1/250\). Compute the
value of a down-and-in call option with strike price \(K = 12\), maturity \(T = 1\),
and barrier \(V = 9\).


The SPLUS code is as follows: 


N <- 10000 #no. of paths
S0 <- 10
K <- 12
V <- 9
Time <- 1
r <- 0.03
sigma <- 0.4
dt <- 1/250
nu <- r - sigma^2/2

t <- (1:(1/dt))*dt
ST <- rep(0,1/dt)
Ci <- rep(0,N)

for (j in 1:N)
{
  z <- rnorm(1/dt)
  for (i in 1:(1/dt))
  {
    ST[i] <- S0*exp(nu*t[i]+sigma*sqrt(dt)*sum(z[1:i]))
  }
  # the payoff uses the terminal price ST[length(ST)], as in step 1 of the algorithm
  if (min(ST)<V) Ci[j] <- exp(-r*Time)*max(ST[length(ST)]-K,0)
  else Ci[j] <- 0
}
C.bar <- mean(Ci)
SE <- sqrt(var(Ci)/(N-1))
CI <- C.bar - 1.96*SE
CI[2] <- C.bar + 1.96*SE
C.bar
SE
CI
For N = 10,000, we get \(\bar{C} = 1.0273\) and the standard error of \(\bar{C}\) is 0.02048.
The 95% confidence interval for \(C\) is [0.9872, 1.0675]. □
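The two-step barrier algorithm can be sketched in Python as follows. This is an illustrative sketch, not the text's listing: the function name `barrier_calls`, the reduced number of paths, and the seed are assumptions. The payoff is evaluated at the terminal price \(S(T)\), as in step 1 of the algorithm, and the down-and-out and vanilla calls are accumulated on the same paths so that the relation \(C_{di} + C_{do} = C\) can be checked path by path.

```python
import math
import random


def barrier_calls(s0=10.0, k=12.0, v=9.0, r=0.03, sigma=0.4, t=1.0,
                  steps=250, n_paths=2000, seed=17):
    """Monte Carlo estimates of (down-and-in, down-and-out, vanilla) call prices
    with the barrier monitored at each of the `steps` daily prices."""
    rng = random.Random(seed)
    dt = t / steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * t)
    c_in = c_out = c_vanilla = 0.0
    for _path in range(n_paths):
        s = s0
        s_min = float("inf")
        for _step in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            s_min = min(s_min, s)          # running minimum of S(t_1), ..., S(t_n)
        payoff = disc * max(s - k, 0.0)    # discounted payoff at the terminal price
        c_vanilla += payoff
        if s_min < v:
            c_in += payoff                 # barrier crossed: option is alive
        else:
            c_out += payoff                # barrier never crossed
    return c_in / n_paths, c_out / n_paths, c_vanilla / n_paths


if __name__ == "__main__":
    c_di, c_do, c = barrier_calls()
    print("down-and-in:", c_di, "down-and-out:", c_do, "vanilla:", c)
```

Only the running minimum and the current price need to be stored, so the whole path does not have to be kept in memory.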


7.3. LOOKBACK OPTION 




The payoffs of lookback options depend on the maximum or the minimum
stock price during the life of the option. Denote the maximum (minimum)
of the stock price over the time period \([0, T]\) by \(S_{\max}(T)\) (\(S_{\min}(T)\)). Four
popular lookback options are:

1. Floating strike lookback call (\(c_{fl}\)): payoff = \(S_T - S_{\min}(T)\);
2. Floating strike lookback put (\(p_{fl}\)): payoff = \(S_{\max}(T) - S_T\);
3. Fixed strike lookback call (\(c_{fix}\)): payoff = \(\max(S_{\max}(T) - K, 0)\);
4. Fixed strike lookback put (\(p_{fix}\)): payoff = \(\max(K - S_{\min}(T), 0)\).


There are lookback put-call parities connecting the floating strike lookback 
call (put) to the fixed strike lookback put (call). Specifically, four put-call 
parities of lookback options are: 


1. \(c_{fl}(t, S, S_{\min}(t)) = S - e^{-r(T-t)}S_{\min}(t) + p_{fix}(t, S, S_{\min}(t); K = S_{\min}(t))\);
2. \(p_{fl}(t, S, S_{\max}(t)) = e^{-r(T-t)}S_{\max}(t) - S + c_{fix}(t, S, S_{\max}(t); K = S_{\max}(t))\);
3. \(c_{fix}(t, S, S_{\max}(t); K) = S - e^{-r(T-t)}K + p_{fl}(t, S, \max(S_{\max}(t), K))\);
4. \(p_{fix}(t, S, S_{\min}(t); K) = e^{-r(T-t)}K - S + c_{fl}(t, S, \min(S_{\min}(t), K))\).


These four put-call parities are model-independent, meaning that they are 
applicable to any asset dynamics. For a proof, we refer to the paper of Wong 
and Kwok (2003). 

Pricing lookback options with simulation is very similar to that of the 
barrier option. Consider the floating strike lookback call option. The SPLUS 
code of Example 7.1 can be modified to obtain the lookback option price. We 
just compute 
\[ \frac{e^{-rT}}{N}\sum_{j=1}^{N}\Big[S_j(T) - \min_i S_j(t_i)\Big]. \]


Other lookback options are valued in the same manner. 

It is interesting to notice that simulating fixed strike lookback options re- 
quires less storage than simulating the floating ones. The reason is that payoffs 
of fixed strike lookback options do not depend on the terminal asset price, Sr. 
Therefore, after generating a sample path, only the maximum or minimum 
price of the path is required. With this observation and the lookback put- 
call parities, a storage saving approach to simulating floating strike lookback 
options can be developed. For valuing a floating strike lookback call, a fixed
strike lookback put with strike price equal to the realized minimum asset
value is simulated. Then, the floating strike call price is extracted from the
first put-call parity.
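The storage-saving idea can be illustrated with a Python sketch that prices a floating strike lookback call at time 0 in two ways: directly, and through the first put-call parity with a fixed strike lookback put struck at \(S_{\min}(0) = S_0\). The function name `lookback_call`, the grid of 50 monitoring dates, the number of paths, and the seed are assumptions made for illustration.

```python
import math
import random


def lookback_call(s0=10.0, r=0.03, sigma=0.4, t=1.0,
                  steps=50, n_paths=5000, seed=19):
    """Return (direct estimate, parity-based estimate) of the floating
    strike lookback call price at time 0."""
    rng = random.Random(seed)
    dt = t / steps
    drift = (r - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    disc = math.exp(-r * t)
    sum_call = 0.0
    sum_put = 0.0
    for _path in range(n_paths):
        s = s0
        s_min = s0                         # realized minimum starts at S_0
        for _step in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            s_min = min(s_min, s)
        sum_call += s - s_min              # floating strike call: S_T - S_min(T)
        sum_put += max(s0 - s_min, 0.0)    # fixed strike put with K = S_0
    direct = disc * sum_call / n_paths
    p_fix = disc * sum_put / n_paths
    via_parity = s0 - disc * s0 + p_fix    # first put-call parity at t = 0
    return direct, via_parity


if __name__ == "__main__":
    direct, via_parity = lookback_call()
    print("direct estimate:", direct)
    print("estimate via put-call parity:", via_parity)
```

The parity-based route never uses the terminal price, so only the running minimum has to be tracked; the two estimates differ only through the sampling error of the discounted terminal price.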


7.4. ASIAN OPTION 


Asian option payoffs depend on the average of the underlying asset prices
during the option life. Asian options are popular in the financial industry
because they cost less than their vanilla counterparts and are less sensitive
to the change in underlying asset prices. The common forms of averaging in
option contracts are the geometric average and the arithmetic average of the
underlying variables. Denote the geometric average and arithmetic average of
the underlying asset in the period \([0, T]\) by \(G_T\) and \(A_T\), respectively. Then,

\[ G_T = \lim_{n \to \infty}\left[\prod_{i=1}^{n} S(t_i)\right]^{1/n} = \exp\left(\frac{1}{T}\int_0^T \log S(t)\,dt\right), \qquad (7.1) \]

\[ A_T = \lim_{n \to \infty}\frac{1}{n}\sum_{i=1}^{n} S(t_i) = \frac{1}{T}\int_0^T S(t)\,dt. \]


For geometric Asian options, analytical pricing formulas are available in the 
literature, see, for example, Wong and Cheung (2004). However, almost all 
Asian options are traded with arithmetic average. For instance, two frequently 
traded Asian options are: 


1. Floating strike Asian call. Payoff = max(Sr — Ar,0); 
2. Fixed strike Asian call. Payoff = max(Ar — K,0). 


In practice, the geometric Asian option prices are used as a control variate in 
simulating their arithmetic counterparts. 

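Let us illustrate the procedure by considering a fixed strike Asian call. The
geometric version of the option has the payoff \(\max(G_T - K, 0)\). Denote \(X_T\)
by \(\log G_T\), i.e.,

\[ X_T = \frac{1}{T}\int_0^T \log S(\tau)\,d\tau. \]

By Itô's lemma,

\[ \log S_\tau = \log S_t + \nu(\tau - t) + \sigma(W(\tau) - W(t)) \qquad (\text{Recall: } \nu = r - \sigma^2/2), \]

which implies

\[
\begin{aligned}
X_T &= \frac{t}{T}X_t + \frac{1}{T}\int_t^T \log S(\tau)\,d\tau \\
&= \frac{t}{T}X_t + \frac{T-t}{T}\log S_t + \nu\frac{(T-t)^2}{2T} + \frac{\sigma}{T}\left[\int_t^T W(\tau)\,d\tau - (T-t)W_t\right] \\
&= \frac{t}{T}X_t + \frac{T-t}{T}\log S_t + \nu\frac{(T-t)^2}{2T} + \frac{\sigma}{T}\left[T(W_T - W_t) - \int_t^T \tau\,dW_\tau\right] \\
&= \frac{t}{T}X_t + \frac{T-t}{T}\log S_t + \nu\frac{(T-t)^2}{2T} + \frac{\sigma}{T}\int_t^T (T - \tau)\,dW_\tau,
\end{aligned}
\]

where the second to last line uses the integration by parts formula, see
Example (2.2). By Itô's identities, see Exercise 1(d) of Chapter 2, we have

\[
E\left[\int_t^T (T - \tau)\,dW(\tau)\right] = 0 \quad\text{and}\quad
\mathrm{Var}\left[\int_t^T (T - \tau)\,dW(\tau)\right] = \int_t^T (T - \tau)^2\,d\tau = \frac{(T-t)^3}{3}.
\]

Therefore,

\[
X_T \sim N\left(\frac{t}{T}X_t + \frac{T-t}{T}\log S_t + \nu\frac{(T-t)^2}{2T},\; \sigma^2\frac{(T-t)^3}{3T^2}\right). \qquad (7.2)
\]

Risk-neutral valuation asserts that

\[ c^G_{fix}(t, S, G_t) = e^{-r(T-t)}E[\max(e^{X_T} - K, 0)]. \]

Applying Lemma 3.1, we obtain the closed form solution as

\[ c^G_{fix}(t, S, G_t) = S_t\left(\frac{G_t}{S_t}\right)^{t/T}e^{R(t;T)}\Phi(d_1) - Ke^{-r(T-t)}\Phi(d_2), \qquad (7.3) \]

where

\[
d_1 = \frac{\frac{t}{T}\log G_t + \frac{T-t}{T}\log S_t - \log K + \left(r - \frac{\sigma^2}{2}\right)\frac{(T-t)^2}{2T} + \sigma^2\frac{(T-t)^3}{3T^2}}{\sigma\sqrt{\frac{(T-t)^3}{3T^2}}}, \qquad (7.4)
\]

\[ d_2 = d_1 - \sigma\sqrt{\frac{(T-t)^3}{3T^2}}, \]

\[ R(t;T) = \left(r - \frac{\sigma^2}{2}\right)\frac{(T-t)^2}{2T} + \sigma^2\frac{(T-t)^3}{6T^2} - r(T-t). \]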

With the analytical solution of the geometric Asian call, we simulate the 
arithmetic Asian price via control variate. The algorithm is presented as 
follows. 


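STEP 1: Generate daily stock prices \(S(t_1), S(t_2), \ldots, S(t_n)\).

STEP 2: Set
\[ C_A^j = e^{-r(T-t)}\max(A_T - K, 0), \qquad C_G^j = e^{-r(T-t)}\max(G_T - K, 0), \]
where \(A_T\) and \(G_T\) are the arithmetic and geometric averages along the
j-th path, including the averages already realized up to time \(t\).

STEP 3: Repeat Steps 1 and 2 \(N\) times.

STEP 4: Compute the regression coefficients \(a\) and \(b\) by fitting
\[ C_A^j = a + bC_G^j, \qquad j = 1, 2, \ldots, N. \]

STEP 5: \(C_A^{fix} = a + b\,c^G_{fix}(t, S, G_t)\) with formula (7.3) applied.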


Example 7.2 Consider the parameter values: \(S_t = 10\), \(r = 0.03\), \(\sigma = 0.4\),
\(t = 0.2\), \(T = 1\) and the realized arithmetic average \(A_t = 10.5\). Simulate the
arithmetic Asian call option with a fixed strike price of $12.


We implement the preceding algorithm by the following SPLUS code: 


###Define parameters###
N <- 1000
TIME <- 1
time <- 0.2
r <- 0.03
sigma <- 0.4
St <- 10
K <- 12
At <- 10.5
Gt <- 10.5
nu <- r - (sigma^2/2)
dT <- TIME-time
dt <- TIME/100
n <- floor(dT*100)
S <- c(rep(St,N))
A <- c(rep(At*(100-n)+St,N))
lnG <- c(rep(log(Gt)*(100-n)+log(St),N))
CA <- c(rep(0,N))
CG <- c(rep(0,N))

#Generate asset price paths and averages
for (j in 1:n){
  S <- S * exp( nu*dt + sigma*sqrt(dt)*rnorm(N,0,1) )
  A <- A + S
  lnG <- lnG + log(S)
}
A <- A/101
lnG <- lnG/101
CA <- exp(-r*dT)*pmax(A-K,0)
CG <- exp(-r*dT)*pmax(exp(lnG)-K,0)

#Compute the analytical solution of the geometric Asian option
SIGMA <- sigma^2*dT^3/3/TIME^2
d1 <- (log(St/K)+time/TIME*log(Gt/St)+nu*dT^2/2/TIME+SIGMA)/sqrt(SIGMA)
d2 <- d1 - sqrt(SIGMA)
RtT <- nu*dT^2/2/TIME + SIGMA/2 - r*dT
CGfix <- St*(Gt/St)^(time/TIME)*exp(RtT)*pnorm(d1) - K*exp(-r*dT)*pnorm(d2)

#Compute the regression coefficients
a <- which( CA*CG != 0 )
aa <- coef( lm( CA ~ CG ) )
price <- aa[1]+CGfix*aa[2]
This simulation gives the arithmetic Asian call (AAC) price to be 0.1698.
The analytical price for the geometric Asian call (GAC) is computed as 0.1318.
The AAC is a bit more expensive than the GAC because the arithmetic mean
always dominates the geometric mean. The computational time is about 10
seconds. □
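The closed form (7.3) is easy to evaluate on its own. The Python sketch below is an illustration only: the function name `geometric_asian_call` and its argument names are assumptions, and the normal c.d.f. is built from `math.erfc`. It writes the price as the discounted expectation of \(\max(e^{X_T} - K, 0)\) for the normal \(X_T\) in (7.2), which is algebraically the same as (7.3).

```python
import math


def geometric_asian_call(s_t, g_t, k, r, sigma, t, maturity):
    """Fixed strike geometric Asian call price at time t, formula (7.3).

    s_t is the current asset price and g_t the geometric average realized
    over [0, t]; maturity is T.
    """
    tau = maturity - t
    nu = r - 0.5 * sigma ** 2
    # Mean and variance of X_T = log G_T given time-t information, as in (7.2).
    mean = ((t / maturity) * math.log(g_t) + (tau / maturity) * math.log(s_t)
            + nu * tau ** 2 / (2.0 * maturity))
    var = sigma ** 2 * tau ** 3 / (3.0 * maturity ** 2)
    d1 = (mean - math.log(k) + var) / math.sqrt(var)
    d2 = d1 - math.sqrt(var)
    phi = lambda x: 0.5 * math.erfc(-x / math.sqrt(2.0))  # standard normal c.d.f.
    # exp(mean + var/2) * exp(-r*tau) equals S_t (G_t/S_t)^{t/T} e^{R(t;T)}.
    return math.exp(-r * tau) * (math.exp(mean + 0.5 * var) * phi(d1) - k * phi(d2))


if __name__ == "__main__":
    # Parameters of Example 7.2 (with G_t = 10.5, as in the S-PLUS listing).
    print(geometric_asian_call(10.0, 10.5, 12.0, 0.03, 0.4, 0.2, 1.0))
```

With the parameters of Example 7.2 the value should come out close to the 0.1318 quoted in the text.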


7.5 AMERICAN OPTION 


American options allow the holder to exercise prior to maturity. This early 
exercise feature exists in major financial markets. The valuation and opti- 
mal exercise of American options is one of the most challenging problems in 
derivatives finance, especially when more than one factor is involved in the 
option contract. 

Although simulation techniques can be used to generate future scenarios,
the forward looking feature of simulation complicates the valuation of
American options, where the optimal exercise policy has to be constructed via
backward induction. When an American put option is valued with a binomial
tree, one has to determine if it is optimal to exercise the option at each node
in a backward manner. A practical approach to valuing American options
with simulation is proposed by Longstaff and Schwartz (2001). This section
presents the idea of American option pricing using this approach.


7.5.1 Simulation: Least Squares Approach 


The best way to illustrate the least squares approach of Longstaff and Schwartz 
(2001) is by means of a concrete example. In the following numerical example, 
we shall introduce the algorithm in detail first and explain the concepts later. 


Example 7.3 Let \(S(0) = 10\), \(r = 0.03\), \(\sigma = 0.4\). Compute the value of an
American put option with strike price \(K = 12\) and maturity \(T = 1\). For
simplicity, assume that the option can be exercised at \(t = 1/3\), \(2/3\) and 1.

We use the formula \(S(t + \Delta t) = S(t)\exp[(r - \sigma^2/2)\Delta t + \sigma\Delta W]\) to generate
asset prices at the exercise time points: \(t = 1/3\), \(2/3\) and 1. Table 7.1 gives eight
sample paths. Terminal payoffs corresponding to each path, \(Y_3\), are given by
the last column of the table. Discounting the sample mean of the terminal
payoffs estimates the European put price to be $2.4343. This is a lower bound
for the American put option.

Path   t = 1/3   t = 2/3   t = 1     Y3 = max(K - S(1), 0)
1       8.3826    9.9528    6.7581   5.2419
2      11.9899   13.8988   14.5060   0
3      13.1381   17.4061   13.4123   0
4       6.8064    7.8115   10.6520   1.3480
5       7.0508    9.1293    7.4551   4.5449
6      11.2214    8.3600    9.2896   2.7104
7       8.9672    8.7787    9.0822   2.9178
8      11.5336   10.9398    8.6958   3.3042

Table 7.1 Sample paths.

At time \(t = 2/3\), the option holder must decide whether to exercise the
option immediately or to continue the option when the option is in-the-money.
To make the decision, the holder should compare the cash flow of immediate
exercise with the expected payoff of continuation given the asset price at time
2/3. Therefore, it is essential to estimate the conditional expected payoff. To
do this, we collect the response variable \(Y_3e^{-r\Delta t}\) and the explanatory variable
\(S(2/3)\) for in-the-money paths in Table 7.2, where \(\Delta t = 1/3\). We model the
expected payoff from continuation at time \(t = 2/3\) as a quadratic polynomial,
\(f_2(S)\), of the asset value at time \(t = 2/3\). Coefficients of the polynomial are
estimated from the data in Table 7.2 by the least squares method. Therefore,
we estimate \(a_0\), \(a_1\), and \(a_2\) from the regression line:

\[ Y_3e^{-r\Delta t} = a_0 + a_1[S(2/3)] + a_2[S(2/3)]^2 + \epsilon. \]

The resulting formula is

\[ E[Y_3e^{-r\Delta t} \mid S(2/3)] = -82.5347 + 17.7788[S(2/3)] - 0.9063[S(2/3)]^2 := f_2(S). \]


Path   Y3 e^{-r Δt}   S(2/3)    Exercise in-the-money?
1      5.1898          9.9528   Yes
2      --             13.8988   No
3      --             17.4061   No
4      1.3346          7.8115   Yes
5      4.4997          9.1293   Yes
6      2.6834          8.3600   Yes
7      2.8888          8.7787   Yes
8      3.2714         10.9398   Yes

Table 7.2 Regression at t = 2/3.


With this conditional expectation function, \(f_2(S)\), we are able to compare
the value of immediate exercise, \(K - S(2/3)\), with the estimated continuation
value and compute the payoffs, \(Y_2\), for each path at \(t = 2/3\). The value of
\(Y_2\) is obtained by the formula

\[
Y_2 = \begin{cases} K - S(2/3), & \text{if } K - S(2/3) > f_2(S(2/3)), \\ e^{-r\Delta t}Y_3, & \text{otherwise.} \end{cases}
\]
This formula asserts that the payoff at time \(t = 2/3\) is \(K - S\) if exercising
the option is worth more than the expected payoff from holding it; otherwise,
the payoff at time 2/3 becomes the discounted cash flow from the next exercise
time. The last column of Table 7.3 gives the payoffs, \(Y_2\), for each
sample path.


Path   Exercise       Continuation     $e^{-r\Delta t} Y_3$   $Y_2$
       $K - S(2/3)$   $f_2(S(2/3))$
1      2.0472         4.6380           5.1898     5.1898
2      --             --               0          0
3      --             --               0          0
4      4.1885         1.0428           1.3346     4.1885
5      2.8707         4.2388           4.4997     4.4997
6      3.6400         2.7554           2.6834     3.6400
7      3.2213         3.6959           2.8888     2.8888
8      1.0602         3.4968           3.2714     3.2714

Table 7.3 Optimal decision at t = 2/3.


Next, we repeat the procedure for $t = 1/3$. In Table 7.4, all sample paths
are in-the-money except path three. Then, the least squares estimation
corresponding to in-the-money paths gives

$$E[Y_2 e^{-r\Delta t} \mid S(1/3)] = -8.9488 + 3.3104\,S(1/3) - 0.2036\,[S(1/3)]^2 := f_1(S).$$

This regression function determines the exercising policy at $t = 1/3$.


Path   $Y_2 e^{-r\Delta t}$   $S(1/3)$   In-the-money?
1      5.1381     8.3826    Yes
2      0         11.9899    Yes
3      0         13.1381    No
4      4.1468     6.8064    Yes
5      4.4549     7.0508    Yes
6      3.6038    11.2214    Yes
7      2.8600     8.9672    Yes
8      3.2388    11.5336    Yes

Table 7.4 Regression at t = 1/3.

Once again, the $Y_1$ in Table 7.5 is computed according to the optimal
decision by the rule

$$Y_1 = \begin{cases} K - S(1/3), & \text{if } K - S(1/3) > f_1(S(1/3)), \\ e^{-r\Delta t} Y_2, & \text{otherwise.} \end{cases}$$

Finally, the current price of the American option is estimated by the average
of $e^{-r\Delta t} Y_1$, i.e., $3.0919, which is higher than the European option price
of $2.4343. □
Path   Exercise       Continuation     $e^{-r\Delta t} Y_2$   $Y_1$
       $K - S(1/3)$   $f_1(S(1/3))$
1      3.6174         4.4921           5.1381     5.1381
2      0.0101         1.4689           0          0
3      --             --               0          0
4      5.1936         4.1494           4.1468     5.1936
5      4.9492         4.2688           4.4549     4.9492
6      0.7786         2.5572           3.6038     3.6038
7      3.0328         4.3620           2.8600     2.8600
8      0.4664         2.1440           3.2388     3.2388

Table 7.5 Optimal decision at t = 1/3.
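
The backward induction of this example can be retraced in a few lines. The following is a minimal sketch in R (close to the S-PLUS dialect used in this text), assuming K = 12 and r = 0.03 as implied by Tables 7.1 to 7.5; the variable names are ours and purely illustrative. The quadratic term is wrapped in I() because, in R model formulas, a bare ^ is interpreted as a formula operator rather than as a power.

```r
# Sample paths from Table 7.1 (columns: t = 1/3, 2/3, 1); K = 12 and r = 0.03 assumed.
S <- matrix(c( 8.3826,  9.9528,  6.7581,
              11.9899, 13.8988, 14.5060,
              13.1381, 17.4061, 13.4123,
               6.8064,  7.8115, 10.6520,
               7.0508,  9.1293,  7.4551,
              11.2214,  8.3600,  9.2896,
               8.9672,  8.7787,  9.0822,
              11.5336, 10.9398,  8.6958), ncol = 3, byrow = TRUE)
K <- 12; r <- 0.03; dt <- 1/3
disc <- exp(-r * dt)

Y <- pmax(K - S[, 3], 0)             # terminal payoff Y_3
for (j in 2:1) {                     # step back through t = 2/3 and t = 1/3
  itm  <- which(S[, j] < K)          # regress on in-the-money paths only
  s    <- S[itm, j]
  yd   <- disc * Y[itm]              # discounted payoff of the next exercise date
  cont <- fitted(lm(yd ~ s + I(s^2)))  # estimated continuation values f_j(S)
  Y    <- disc * Y                   # default decision: continue
  ex   <- itm[(K - s) > cont]        # paths where immediate exercise is worth more
  Y[ex] <- K - S[ex, j]
}
price    <- disc * mean(Y)                           # American put estimate
european <- exp(-r) * mean(pmax(K - S[, 3], 0))      # European counterpart
round(c(american = price, european = european), 4)
```

With only eight paths, the result should agree with the figures quoted above up to rounding; the sketch is meant to make the mechanics explicit rather than to produce an accurate price.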


7.5.2 Analyzing the Least Squares Approach 


Consider an American put option with exercise rights at $t_1 < \cdots < t_n = T$.
To simplify matters, we assume $t_{j+1} - t_j = \Delta t$ for $j = 1, 2, \ldots, n-1$. Given a
sample path of the underlying asset price, $\{S(t_1), S(t_2), \ldots, S(t_n)\}$, we study
the possible payoffs received by the option holder at each of the exercise time
points. Clearly, if the option is not exercised prematurely, then the holder
receives the terminal payoff, denoted by $Y_n = \max(K - S(t_n), 0)$. At time
$t = t_{n-1}$, the corresponding payoff, $Y_{n-1}$, depends on the holder's decision
whether to exercise the option. Therefore,

$$Y_{n-1} = \begin{cases} K - S(t_{n-1}), & \text{exercise}, \\ e^{-r\Delta t} Y_n, & \text{continue}. \end{cases}$$

This formula indicates that the option holder receives $K - S(t_{n-1})$ if the
optimal decision is to exercise the option. Otherwise, the holder will receive
a cash flow of $Y_n$ at the next time step. The present value of this cash flow
is obtained by multiplying by the discount factor $e^{-r\Delta t}$. Inductively, the
payoff $Y_j$ at time $t_j$ can be described as

$$Y_j = \begin{cases} K - S(t_j), & \text{exercise}, \\ e^{-r\Delta t} Y_{j+1}, & \text{continue}. \end{cases} \tag{7.5}$$

This iterative process continues until $Y_1$ is obtained. Since the option holder has
no exercise right in the time period $[0, t_1)$, the American put option can be
viewed as a European option that expires at $t_1$ with payoff $Y_1$. Risk-neutral
valuation allows us to value the American put, $P_A(0, S)$, as

$$P_A(0, S) = E\left[e^{-r t_1} Y_1 \mid S_0 = S\right].$$

Therefore, a typical simulation algorithm generates $N$ sample paths, each of
which is processed by the above recursion to obtain $\{Y_1^{(1)}, \ldots, Y_1^{(N)}\}$. The
American put is estimated by

$$\hat{P}_A(0, S) = \frac{1}{N} \sum_{i=1}^{N} e^{-r t_1} Y_1^{(i)}. \tag{7.6}$$


The above simulation is incomplete, however. To simulate the American
put, the payoff, $Y_j$, at time $t_j$ should be obtained via simulation. This
requires the simulation algorithm to detect optimal exercise at each time point
successively. In other words, we have to clarify the condition for exercising the
option in (7.5). It is crucial that the optimal decision should not be made
by simply comparing the values of $K - S(t_j)$ and $e^{-r\Delta t} Y_{j+1}$ in (7.5). The
reason is that the decision at time $t_j$ should be based on the information up to
$t_j$. However, the value $Y_{j+1}$ depends on the asset value at $t_{j+1}$. The correct
approach is to compare the immediate exercise cash flow $K - S(t_j)$ with the
expectation of the discounted cash flow conditional on the asset price $S(t_j)$.
This turns (7.5) into

$$Y_j = \begin{cases} K - S(t_j), & \text{if } K - S(t_j) > f_j(S(t_j)), \\ e^{-r\Delta t} Y_{j+1}, & \text{if } K - S(t_j) \le f_j(S(t_j)), \end{cases} \tag{7.7}$$

where $f_j(S(t_j))$ is the conditional expectation function at $t_j$, that is,

$$f_j(S(t_j)) = E\left[e^{-r\Delta t} Y_{j+1} \mid S(t_j)\right]. \tag{7.8}$$
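
To see why the realized value of the next payoff must not enter the decision, the following R sketch applies the "foresight" rule that compares $K - S(t_j)$ directly with $e^{-r\Delta t} Y_{j+1}$ on every path. The parameters are those of the running example (initial price 10, K = 12, r = 0.03, volatility 0.4); the path count, the number of exercise dates, and all variable names are illustrative assumptions of ours. Under this rule each path simply collects its largest discounted exercise value, so the resulting average can never fall below the European estimate computed on the same paths and tends to overstate what a holder without knowledge of the future could realize.

```r
set.seed(123)
N <- 5000; n <- 50; Tmat <- 1; dt <- Tmat / n   # illustrative simulation sizes
S0 <- 10; K <- 12; r <- 0.03; sigma <- 0.4      # parameters of the running example

# Geometric Brownian motion paths on the exercise dates t_1, ..., t_n.
Z <- matrix(rnorm(N * n), N, n)
logret <- (r - sigma^2 / 2) * dt + sigma * sqrt(dt) * Z
S <- S0 * exp(t(apply(logret, 1, cumsum)))      # N x n matrix of prices

disc <- exp(-r * dt)
Y <- pmax(K - S[, n], 0)                        # terminal payoff Y_n
for (j in (n - 1):1) {
  # Foresight rule: the realized Y_{j+1} is inspected before deciding at t_j.
  Y <- pmax(K - S[, j], disc * Y)
}
foresight <- disc * mean(Y)                     # discount from t_1 back to time 0
european  <- exp(-r * Tmat) * mean(pmax(K - S[, n], 0))
c(foresight = foresight, european = european)
```

Replacing the realized $Y_{j+1}$ by an estimate of its conditional expectation, as in (7.7), removes this look-ahead.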


The key to the Longstaff and Schwartz (2001) approach is the use of least
squares to estimate the function $f_j(S)$. Under certain technical conditions, it
can be shown that the function $f_j(S(t_j))$ can be approximated by a polynomial
in $S(t_j)$. In other words,

$$f_j(S(t_j)) = \sum_{k=0}^{\infty} a_k [S(t_j)]^k,$$

where $\{a_k\}$ converges to zero rapidly. Therefore, one way to approximate
$f_j(S)$ is by truncating the polynomial of infinite order to a polynomial of
finite order. The coefficients of the finite order polynomial are estimated by the
least squares method.

In Example 7.3, we use a polynomial of degree 2 to approximate $f_j(S)$.
The simulation starts by generating $N$ asset price paths, $\{S_i(t_1), \ldots, S_i(t_n)\}$
for $i = 1, 2, \ldots, N$. When $t = t_n$, it is clear that $Y_n^{(i)} = \max(K - S_i(t_n), 0)$
for path $i$. We go one step back to the time point $t = t_{n-1}$, where $N$
possible asset prices have been generated. Then, the coefficients $a_0$, $a_1$, and
$a_2$ are obtained by least squares estimation of the regression equation:

$$f_{n-1}(S) = E[e^{-r\Delta t} Y_n \mid S] = a_0 + a_1 [S(t_{n-1})] + a_2 [S(t_{n-1})]^2. \tag{7.9}$$

The estimation is based on the sample $\{(S_i(t_{n-1}), Y_n^{(i)}) \mid K > S_i(t_{n-1}),\ i = 1, \ldots, N\}$,
i.e., the in-the-money paths. Then, payoffs at $t_{n-1}$ are calculated via
the rule (7.7). Having a sample of payoffs $\{Y_{n-1}^{(i)} \mid i = 1, 2, \ldots, N\}$ at $t_{n-1}$, we
go one step back to the time point $t_{n-2}$ and repeat the process. Eventually, we
obtain $N$ possible payoffs, $\{Y_1^{(i)} \mid i = 1, 2, \ldots, N\}$, at $t_1$. Monte Carlo simulation
estimates the current option price by the average in (7.6).


Remarks: 


1. In the regression equation (7.9), only in-the-money paths are used in 
the least squares estimation as these paths are sensitive to immediate 
exercise. Remember that the option holder will exercise the option only 
when it is in-the-money. 


2. An obvious way to improve the accuracy is to increase the number of
terms in (7.9). However, one has to strike a balance between increasing
the number of terms and the quality of the estimates. Numerical
experiments show that a polynomial of degree 3 is a reasonable choice.


3. Instead of using ordinary monomials as basis functions in (7.9), one 
may consider other basis functions, like Hermite, Laguerre, Legendre, 
Chebyshev, Gegenbauer, and Jacobi polynomials. Numerical tests of 
Moreno and Navas (2003) show that the least squares approach is quite 
robust to the choice of basis functions. For more complex derivatives, 
this choice can slightly affect option prices. 


4. The recent analysis of Stentoft (2004) indicates that a modified spec- 
ification using ordinary monomials is preferred over the specification 
based on Laguerre polynomials used in Longstaff and Schwartz (2001). 
Furthermore, the least squares method is computationally more efficient 
than other numerical methods, such as finite difference, especially when 
high dimensional problems are concerned. 


5. The paper by Longstaff and Schwartz (2001) points out that the $R^2$
values of the regressions are often low. This means that the volatility
of unexpected cash flows is large relative to the expected cash flows.
However, since the least squares simulation is based on conditional first
moments rather than higher moments, the $R^2$ values of the regressions should
have little impact on the estimated American option price.


6. If the user is really concerned about the $R^2$, it may be more efficient to
use other techniques such as weighted least squares and the generalized
method of moments (GMM) in estimating the conditional expectation function.
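
The role of the number of terms in (7.9) can be explored with a small experiment. The sketch below, written in R, wraps the backward induction in a hypothetical helper lsm_put() whose argument degree sets the order of the monomial basis (built with poly(..., raw = TRUE)); the helper, its arguments, and the simulation sizes are our own illustrative choices, not part of the original method description. Other basis functions could be substituted by replacing the design matrix X.

```r
# Least squares simulation of an American put with a monomial basis of given degree.
# S is an N x n matrix of simulated prices at the exercise dates t_1, ..., t_n.
lsm_put <- function(S, K, r, dt, degree = 2) {
  n <- ncol(S)
  disc <- exp(-r * dt)
  Y <- pmax(K - S[, n], 0)                      # terminal payoff
  for (j in (n - 1):1) {
    Y <- disc * Y                               # value if the holder continues
    itm <- which(S[, j] < K)                    # in-the-money paths only
    if (length(itm) > degree + 1) {             # enough points for the regression
      X <- poly(S[itm, j], degree, raw = TRUE)  # columns S, S^2, ..., S^degree
      cont <- fitted(lm(Y[itm] ~ X))            # estimated continuation value
      ex <- itm[(K - S[itm, j]) > cont]
      Y[ex] <- K - S[ex, j]                     # exercise where it is worth more
    }
  }
  disc * mean(Y)                                # discount from t_1 to time 0
}

set.seed(2006)
N <- 5000; n <- 50; dt <- 1 / n                 # illustrative simulation sizes
S0 <- 10; K <- 12; r <- 0.03; sigma <- 0.4      # parameters of the running example
Z <- matrix(rnorm(N * n), N, n)
logret <- (r - sigma^2 / 2) * dt + sigma * sqrt(dt) * Z
S <- S0 * exp(t(apply(logret, 1, cumsum)))

prices <- sapply(1:4, function(d) lsm_put(S, K, r, dt, degree = d))
names(prices) <- paste0("degree", 1:4)
prices
```

Because all four estimates are computed from the same paths, differences between them reflect the choice of basis rather than sampling noise.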


Example 7.4 Using the parameters in the preceding example, simulate the
American put price with continuous exercise rights and hence determine the
optimal exercise policy. The simulation is based on 10,000 sample paths with
$\Delta t = 1/100$.


The SPLUS code is as follows:


N <- 10000
n <- 100
dt <- 1/n
r <- 0.03
sigma <- 0.4
S0 <- 10
K <- 12

nu <- r - sigma^2/2
stock <- matrix(0, N, n+1)
y <- c(rep(0, N))
put <- c(rep(0, N))
boundary <- c(rep(0, n))
stock[,1] <- S0
check1 <- proc.time()   # the first check point

# generate asset price paths
for (i in 1:n) {
  stock[,i+1] <- stock[,i] * exp(nu*dt + sigma*sqrt(dt)*rnorm(N, 0, 1))
}
y <- pmax((K - stock[,n+1]), 0)

for (j in n:2) {
  a <- which(stock[,j] < K)   # identify in-the-money paths
  if (length(a) >= 3) {       # ensure there is a solution for the regression
    # Compute the conditional expectation function.
    # I(S^2) is needed so that the quadratic term enters the model formula.
    S <- stock[a,j]
    A <- coef(lm(y[a] ~ S + I(S^2), singular.ok = TRUE))
    if (is.na(A[3])) { A[3] <- 0 }
    X <- matrix(c(rep(1, N), stock[,j], stock[,j]^2), ncol = 3)
    put <- X %*% A
    put <- exp(-r*dt) * pmax(put, 0)

    # determine Y and find the boundary of the exercise region
    b <- which((K - stock[,j]) > put)
    y <- exp(-r*dt) * y
    y[b] <- K - stock[b,j]    # assign y as K-S when K-S > P
    if (length(b) == 0) { boundary[j] <- NA }   # boundary cannot be estimated
    else { boundary[j] <- max(stock[b,j]) }
  }
  else {
    y <- exp(-r*dt) * y
    boundary[j] <- NA
  }
}
check2 <- proc.time()   # the second check point

boundary[n+1] <- K
boundary[1:20] <- NA    # give up the estimate for t < 0.2
time <- c(0:n)/n
plot(time, boundary, type = "h", ylim = c(0, K), xlab = "time", ylab = "asset price")

price <- exp(-r*dt) * mean(y)
check2 - check1   # check the CPU time for the valuation
price             # price of the American put option


[Plot: asset price (vertical axis, 0 to 12) against time (horizontal axis)]

Fig. 7.1 The exercising region of the American put option.


By using quadratic conditional expectation functions, our simulation
estimates the American put price as 2.739 within 15 seconds, which is consistent
with the binomial model of Hull (2006). For the early exercise policy, we
collect the maximum asset value that belongs to the exercising region at each
time. For t > 0.2, Fig. 7.1 plots the exercise policy against time. It is
optimal to exercise the option if the stock price falls into the shaded region. It
is seen that the early exercise boundary looks like an increasing function of
calendar time and hence a decreasing function of the time to maturity. For
t < 0.2, our simulation has no path in the exercising region, so we are unable
to graph the exercising boundary. □


7.5.3 American-Style Path-Dependent Options


The examples considered so far concern the pricing of American put options;
however, the least squares approach is applicable to any early-exercisable
contingent claim. Denote the terminal payoff function of a path-dependent option by
$F(S_T, \xi_T)$, where $\xi$ is an exogenous variable. For instance, $\xi_T = S_{\min}(T)$ for
a barrier option or a lookback option and $\xi_T = A_T$ for an Asian option. The
American-style path-dependent option with payoff $F(S_T, \xi_T)$ can be simulated
as follows.


STEP 1: Generate asset price paths $\{S_i(t_1), S_i(t_2), \ldots, S_i(t_n)\}$ for $i = 1, 2, \ldots, N$.
Set $j = n - 1$ and $Y_n^i = F(S_i(t_n), \xi_i(t_n))$.

STEP 2: Use least squares to estimate the coefficients of a polynomial of
degree $m$, $p_m(S_i(t_j), \xi_i(t_j))$, from

$$e^{-r\Delta t} Y_{j+1}^i = p_m(S_i(t_j), \xi_i(t_j)),$$

for in-the-money paths.

STEP 3: If $F(S_i(t_j), \xi_i(t_j)) \ge p_m(S_i(t_j), \xi_i(t_j))$, then set $Y_j^i = F(S_i(t_j), \xi_i(t_j))$;
otherwise, set $Y_j^i = e^{-r\Delta t} Y_{j+1}^i$.

STEP 4: If $j > 1$, then set $j = j - 1$ and go to STEP 2.

STEP 5: The American option price $= \frac{1}{N} \sum_{i=1}^{N} e^{-r t_1} Y_1^i$.


Example 7.5 Suppose $S_0 = 100$, $r = 0.03$, $\sigma = 0.4$, $T = 7/12$ (7 months).
Simulate the American-style floating strike arithmetic Asian put option and
plot the optimal exercise regions for t = 0.2, 0.4, 0.6 and 0.8. The simulation
is based on 10,000 sample paths with $\Delta t = 1/100$.


We approximate the conditional expectation function, $f_j(S, A)$, by a
two-variable quadratic polynomial, i.e.,

$$f_j(S, A) = a_{00} + a_{10} S + a_{20} S^2 + a_{11} S A + a_{01} A + a_{02} A^2.$$


The SPLUS code is as follows.

N <- 10000
n <- 100
dt <- 1/n
r <- 0.03
sigma <- 0.4
S0 <- 10
nu <- r - sigma^2/2
stock <- matrix(0, N, n+1)
K <- matrix(0, N, n+1)   # K holds the running average (the floating strike)
y <- c(rep(0, N))
put <- c(rep(0, N))
stock[,1] <- S0
K[,1] <- S0
A <- matrix(0, 6, n)
check1 <- proc.time()    # the first check point

# Generate asset price paths and realized averages
for (j in 1:n) {
  stock[,j+1] <- stock[,j] * exp(nu*dt + sigma*sqrt(dt)*rnorm(N, 0, 1))
  K[,j+1] <- K[,j] + stock[,j+1]
}
for (j in 1:(n+1)) { K[,j] <- K[,j]/j }
y <- pmax((K[,n+1] - stock[,n+1]), 0)

# Compute conditional expectation function
for (j in n:3) {
  # Collect in-the-money paths
  a <- which(stock[,j] < K[,j])
  if (length(a) >= 6) {
    S <- stock[a,j]
    zeta <- K[a,j]
    # I() keeps the powers and the product as separate regressors,
    # in the same order as the columns of X below
    A[,j] <- coef(lm(y[a] ~ S + I(S^2) + zeta + I(zeta^2) + I(S*zeta)))
    xx <- c(rep(1, N), stock[,j], stock[,j]^2, K[,j], K[,j]^2, stock[,j]*K[,j])
    X <- matrix(xx, ncol = 6)
    put <- X %*% A[,j]
    put <- exp(-r*dt) * pmax(put, 0)
    # Find S that is in the exercising region
    b <- which((K[,j] - stock[,j]) > put)
    y <- exp(-r*dt) * y
    y[b] <- K[b,j] - stock[b,j]   # assign y as K-S when K-S > P
  }
  else { y <- exp(-r*dt) * y }
}
price <- exp(-2*r*dt) * mean(y)
check2 <- proc.time()    # the second check point
check2 - check1          # check the CPU time for the valuation

########################################################
## Plot the exercising region at t=0.2,0.4,0.6,0.8    ##
## The continuation region is the shaded region       ##
########################################################

par(mfrow = c(2, 2))
for (j in c(20, 40, 60, 80)) {
  xx <- c(rep(1, N), stock[,j], stock[,j]^2, K[,j], K[,j]^2, stock[,j]*K[,j])
  X <- matrix(xx, ncol = 6)
  put <- exp(-r*dt) * X %*% A[,j]
  b <- which((K[,j] - stock[,j]) > put)
  # one panel per time point, titled "t = 0.2", ..., "t = 0.8"
  plot(K[b,j], stock[b,j], type = "h", xlab = "average", ylab = "asset price",
       xlim = c(5, 13), ylim = c(5, 13), main = paste("t =", j/n))
}


Our simulation estimates the option price to be 9.783. This number is
consistent with the one obtained by the finite difference method in Hansen
and Jorgensen (2000). The CPU time is about 17 seconds for the computation.
Fig. 7.2 plots the exercise boundaries at times 0.2, 0.4, 0.6 and 0.8. The
boundaries are the interfaces between the shaded and nonshaded regions. The
shaded regions are the continuation regions. For t = 0.2, there are
fewer points falling into the exercising region. Thus, the simulation is only able
to graph the exercise boundary for underlying asset prices in the range of 7
to 11 at t = 0.2. □


[Plot: asset price (vertical axis) against average (horizontal axis)]

Fig. 7.2 Exercise regions of the American-style Asian option.


7.6 GREEK LETTERS


As pointed out in Chapter 5, hedging is sometimes more important than
pricing in risk management. Option hedging requires risk managers to compute
option Greeks, like delta, gamma, vega, and theta. We refer interested
readers to Hull (2006) for the application of Greeks in hedging and Joshi (2003)
for discrete tree approximation. The Greek letters represent partial derivatives
of the option pricing formula with respect to different
parameters. Since most options, especially path-dependent options, do not
have closed form pricing formulas, Greeks should be obtained by means of
simulation. For single-asset path-independent options, the simulation can be
constructed via Theorem 5.3. However, this theorem is inapplicable to
path-dependent options. Thus, we introduce an alternative practical approach to
simulating Greeks.

Let $V$ denote the pricing formula of an option. The option Greeks are
defined as follows.

$$\text{Delta} = \frac{\partial V}{\partial S}, \quad \text{Gamma} = \frac{\partial^2 V}{\partial S^2}, \quad \text{Vega} = \frac{\partial V}{\partial \sigma}, \quad \text{Theta} = \frac{\partial V}{\partial t}, \quad \text{Rho} = \frac{\partial V}{\partial r},$$

where $S$ is the underlying asset price, $\sigma$ is the volatility, $t$ is the time
variable and $r$ is the spot interest rate. Hence, the Greeks can be obtained by
standard differentiation techniques or approximated by the numerical finite
difference method (FDM) if the option pricing formula is available. The FDM
computes numerical derivatives by approximating the first-principles definition
of the derivative. For instance, suppose we are interested in the Delta of an option.
Then, the FDM approximates the value by

$$\text{Delta} \approx \frac{V(S + h) - V(S)}{h}, \tag{7.10}$$

where $h$ is an arbitrarily chosen small number and the other parameters are fixed.

The approach introduced here combines simulation with the FDM.
Suppose we need the Delta of an option. Then we proceed as follows. First,
the option price is simulated as usual with the current realized asset price $S$.
Second, we resimulate the option price with a "perturbed" asset price
$S + h$. Finally, the Delta is approximated by (7.10). However, the stability of
this approach is of great concern since there are two sources of error:
simulation error and FDM error. The most critical one is the simulation error,
which keeps the numerator of (7.10) away from zero even when $h$ tends to zero,
so that the quotient becomes unstable for small $h$. To
circumvent this difficulty, it is very common for market practitioners to use the
same set of random numbers in the first and the second steps. We illustrate
these ideas with the down-and-out call option in the following example.
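
The effect of reusing the random numbers is easiest to see where the exact answer is known. The following R sketch therefore uses a plain European call (an assumption made only for illustration; it is not the down-and-out call of the next example), for which the Black-Scholes delta is $N(d_1)$. The function price() and the variable names are ours.

```r
set.seed(1)
S0 <- 100; K <- 95; r <- 0.05; sigma <- 0.4; Tmat <- 1
N <- 100000; h <- 0.01

# Discounted Monte Carlo price of a European call for a given vector of normals z.
price <- function(S, z) {
  ST <- S * exp((r - sigma^2 / 2) * Tmat + sigma * sqrt(Tmat) * z)
  exp(-r * Tmat) * mean(pmax(ST - K, 0))
}

z <- rnorm(N)
delta_crn <- (price(S0 + h, z) - price(S0, z)) / h          # same random numbers
delta_ind <- (price(S0 + h, rnorm(N)) - price(S0, z)) / h   # fresh random numbers

d1 <- (log(S0 / K) + (r + sigma^2 / 2) * Tmat) / (sigma * sqrt(Tmat))
delta_bs <- pnorm(d1)                                        # exact Black-Scholes delta
c(common = delta_crn, independent = delta_ind, exact = delta_bs)
```

With common random numbers the simulation error largely cancels in the numerator of (7.10); with independent draws it is divided by the small number $h$, so the second estimate is typically far from the exact value.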


Example 7.6 Suppose $S(0) = 100$, $r = 0.05$, $\sigma = 0.4$, $T = 1$ (1 year).
Estimate the delta of a down-and-out call option with a strike price of 95 and
a downside barrier of 80.


[Histogram: frequency (vertical axis) against estimated delta (horizontal axis)]

Fig. 7.3 The strike against the delta of a down-and-out call option.

We base our simulation on 100,000 sample paths, each of which is divided
into 100 equally spaced intervals. Therefore, this simulation requires 10
million independent normal random variables, namely $\epsilon_{ij}$ with $i = 1, 2, \ldots, 100$
and $j = 1, 2, \ldots, 100{,}000$. Using the set $\{\epsilon_{ij}\}$, we produce the sample paths
$\{S_j(t_1), \ldots, S_j(t_{100})\}$ using the Black-Scholes dynamics of the asset price with
$S_j(0) = 100$ for all $j$. Therefore, we get the price $C_{do}$ as in Section 7.2. To
obtain delta, we repeat the above procedure by assuming $S_j(0) = 100 + h$,
where $h = 0.01$, to estimate the option price again. It is important to recall
that we must use the same set of $\epsilon_{ij}$. After that, the delta is approximated
by the finite difference method. Our simulation estimates the delta of the
down-and-out call option to be 0.863. Fig. 7.3 shows the distribution of the
delta estimates over 100 simulations. The corresponding programming code
is given as follows. □


### compute the delta for European Down-and-Out Call Option ###
### parameters ###
N <- 100000
S0 <- 100
L <- 80
r <- 0.05
sigma <- 0.4
nu <- (r - sigma^2/2)
t <- 1
n <- 100
dt <- t/n
h <- 0.05
K <- 93:110

returnS <- matrix(0, N, n)
deltaC <- matrix(0, 4, 18)

### estimation of delta ####
for (i in 1:4) {
  returnS <- matrix(rnorm(N*n, nu*dt, sigma*sqrt(dt)), N, n)
  for (k in 1:18) {   # one column of deltaC per strike in K
    S1 <- c(rep(S0, N))
    S2 <- c(rep(S0+h, N))
    out1 <- c(rep(1, N))
    out2 <- c(rep(1, N))

    ## find the terminating price and whether it touches the barrier ##
    for (j in 1:n) {
      S1 <- S1*exp(returnS[,j]);
      S2 <- S2*exp(returnS[,j]);
      out1 <- ifelse(S1 > L, 1*out1, 0);
      out2 <- ifelse(S2 > L, 1*out2, 0);
    }

    C1 <- mean(ifelse(out1 == 1, pmax(S1-K[k], 0), 0))*exp(-r*t);
    C2 <- mean(ifelse(out2 == 1, pmax(S2-K[k], 0), 0))*exp(-r*t);
    deltaC[i,k] <- (C2 - C1)/h   # C1 and C2 are already discounted
    print(c(C1, C2, deltaC[i,k]))
  }
}

### true value of delta (for K>boundary only) ###
eta2 <- 2*r/(sigma^2) + 1;
a <- (log(S0/K) + (r+sigma^2/2)*t)/sigma/sqrt(t);
b <- (log(L^2/K/S0) + (r+sigma^2/2)*t)/sigma/sqrt(t);
DOCdelta <- pnorm(a) + (eta2-1)*(L/S0)^eta2*pnorm(b);
temp1 <- (eta2-2)*exp(-r*t)*(L/S0)^(eta2-2);
temp2 <- K/S0*pnorm(b - sigma*sqrt(t));
DOCdelta <- DOCdelta - temp1*temp2;

a <- c(0.7, 1.3)
plot(K, deltaC[1,], type = 'l', xlab = 'strike', ylab = 'delta', ylim = a)
for (i in 2:4) { lines(K, deltaC[i,], type = "l") }
lines(K, DOCdelta, type = "o")


Other Greek letters can be obtained in a similar manner. For instance,
the Gamma is the second-order partial derivative of the option pricing
formula with respect to the underlying asset price. To estimate its value, we
can approximate the second-order derivative by central finite differencing
such that

$$\text{Gamma} \approx \frac{V(S + h) - 2V(S) + V(S - h)}{h^2}.$$

Therefore, we are required to compute $V(S - h)$ on top of $V(S)$ and $V(S + h)$.
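
As a check on the central difference formula, the following R sketch applies it, with common random numbers, to a plain European call, for which the Black-Scholes gamma $\phi(d_1)/(S\sigma\sqrt{T})$ is available. The choice of a European call, the step size $h = 1$, the number of paths, and the names V, gamma_fd, and gamma_bs are illustrative assumptions of ours. Since the payoff has a kink at the strike, only paths ending near the strike contribute to the second difference, so a very small $h$ inflates the simulation error; the moderate step used here is a compromise between that error and the finite difference bias.

```r
set.seed(7)
S0 <- 100; K <- 95; r <- 0.05; sigma <- 0.4; Tmat <- 1
N <- 200000; h <- 1
z <- rnorm(N)                      # one set of normals shared by all three prices

# Discounted Monte Carlo price of a European call started at S.
V <- function(S) {
  ST <- S * exp((r - sigma^2 / 2) * Tmat + sigma * sqrt(Tmat) * z)
  exp(-r * Tmat) * mean(pmax(ST - K, 0))
}

gamma_fd <- (V(S0 + h) - 2 * V(S0) + V(S0 - h)) / h^2   # central difference

d1 <- (log(S0 / K) + (r + sigma^2 / 2) * Tmat) / (sigma * sqrt(Tmat))
gamma_bs <- dnorm(d1) / (S0 * sigma * sqrt(Tmat))       # exact Black-Scholes gamma
c(simulated = gamma_fd, exact = gamma_bs)
```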


Example 7.7 Using the input parameters in Example 7.6, plot the gamma
of the down-and-out call option against the strike price, where the strike price
varies from 93 to 110.


#### compute the gamma for European Down-and-Out Call Option ####
### parameters are the same as before ###

returnS <- matrix(0, N, n)
gammaC <- matrix(0, 5, 18)   # one row per simulation run, one column per strike

for (i in 1:5) {
  returnS <- matrix(rnorm(N*n, nu*dt, sigma*sqrt(dt)), N, n)
  for (k in 1:18) {
    S1 <- c(rep(S0-h, N))
    S2 <- c(rep(S0, N))
    S3 <- c(rep(S0+h, N))
    out1 <- c(rep(1, N))
    out2 <- c(rep(1, N))
    out3 <- c(rep(1, N))

    # find the terminating price and whether it touches the barrier #
    for (j in 1:n) {
      S1 <- S1*exp(returnS[,j]);
      S2 <- S2*exp(returnS[,j]);
      S3 <- S3*exp(returnS[,j]);
      out1 <- ifelse(S1 > L, 1*out1, 0);
      out2 <- ifelse(S2 > L, 1*out2, 0);
      out3 <- ifelse(S3 > L, 1*out3, 0);
    }

    C1 <- mean(ifelse(out1 == 1, pmax(S1-K[k], 0), 0))*exp(-r*t);
    C2 <- mean(ifelse(out2 == 1, pmax(S2-K[k], 0), 0))*exp(-r*t);
    C3 <- mean(ifelse(out3 == 1, pmax(S3-K[k], 0), 0))*exp(-r*t);
    gammaC[i,k] <- (C1 - 2*C2 + C3)/h^2;   # C1, C2, C3 are already discounted
    print(c(C1, C2, C3, gammaC[i,k]))
  }
}

a <- c(min(gammaC), max(gammaC))
plot(K, gammaC[1,], type = 'l', xlab = 'strike', ylab = 'gamma', ylim = a)
for (i in 2:5) { lines(K, gammaC[i,], type = "l") }


7.7 EXERCISES 


1. Verify equations (7.1) and (7.3). 


2. By modifying Example 7.1, simulate the price of a down-and-out call,
which will be knocked out if the underlying asset price goes below $8.


3. By modifying Example 7.1, simulate prices of a fixed lookback put option
and a floating lookback call if the fixed strike price and the realized
minimum asset prices are both $8. Verify the lookback put-call parities
of these options.


4. Show that the American call option price equals that of its European
counterpart if the underlying asset pays no dividends. In other words, it is
never optimal to exercise the American call option prior to maturity if
the underlying asset pays no dividends.


5. By modifying Example 7.4, simulate the price of an American call option
with a strike of $12 and a dividend yield $\delta$ of 4%. Hint: The risk-neutral
dynamics of an asset paying a continuous dividend yield is given by

$$\frac{dS}{S} = (r - \delta)\,dt + \sigma\,dW.$$

What is the optimal exercising policy from your simulation? Plot the
critical asset prices against time.


6. A forward start option is a path-dependent option whose strike price
will be set equal to the underlying asset price at a future time. For instance,
the forward start call option payoff is

$$\max(S_T - S_{t_1}, 0),$$

where $0 < t_1 < T$.


(a) Suppose $S_0 = \$10$, $\sigma = 0.4$, $r = 0.1$, $\delta = 0.5$, $T = 0.5$ and $t_1 = 0.3$.
Construct and implement an algorithm for the forward start call
option with 1,000 sample paths.

(b) Denote by $C_{BS}(S, t; K, T)$ the Black-Scholes formula for the
standard call option. Based on financial insights, a risk analyst
speculates that the forward start call option is the discounted standard
call price. That is,

Current forward start call price $= e^{-r t_1} C_{BS}(S_0, t_1; S_0, T)$.

Verify this conjecture by your simulation.


(c) Suppose that the option has a continuous early exercise right after
$t = t_1$. Determine the option price by the least squares simulation
with 10,000 sample paths.


Simulation Techniques in Financial Risk Management 
by Ngai Hang Chan and Hoi Ying Wong 
Copyright © 2006 John Wiley & Sons, Inc. 


Multi-asset Options 


8.1 INTRODUCTION 


Multi-asset options are exotic options whose payoffs depend on the values of
multiple assets. Multi-asset options abound in the financial market. An obvious
example is index options, where the underlying variable, the financial
index, can be thought of as a portfolio of multiple assets. The challenges of valuing
multi-asset options are the curse of dimensionality and the lack of analytical
tractability. These problems can be circumvented by simulations.

Some examples of multi-asset options traded in the financial market are
first introduced. Let $S_1, S_2, \ldots, S_n$ denote the prices of $n$ different assets.


1. Exchange options: the right to exchange an asset for another. Thus, the
option payoff is $\max(S_1 - cS_2, 0)$, where $c$ is a constant multiplicative
factor. This option is useful, for example, when a US investor wants to
buy Japanese yen with eurodollars.


2. Quanto options: options on stocks in a foreign country, i.e., involving
the exchange rate. If we treat $S_1$ as the exchange rate and $S_2$ as the
underlying asset in the foreign country, then there are a number of
possible quanto option payoffs, like $S_1 \max(S_2 - K, 0)$, $\max(S_1 S_2 - K, 0)$,
$\max(S_1, C) \max(S_2 - K, 0)$, and $C \max(S_2 - K, 0)$, where $C$ is a fixed
constant. The last payoff function appears to be that of a single-asset option.
However, the volatility of the exchange rate, $S_1$, does contribute to the
option price if $S_1$ and $S_2$ are correlated.


3. Basket options: options on a portfolio. The payoff of a call on a
portfolio is $\max(\Pi - K, 0)$, where $\Pi = \sum_{i=1}^{n} a_i S_i$.


4. Extreme options: options on the extrema of different assets. The
maximum call option has the payoff $\max\{\max(S_1, S_2, \ldots, S_n) - K, 0\}$.


All multi-asset options can be traded in European or American style.
Complex multi-asset options, or structured products, may even involve
path-dependent features. In such cases, simulations are indispensable.
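
For simulation purposes, each of these payoffs is a simple vectorized function of the simulated terminal prices. The following R sketch illustrates this; the function names and the small matrix of prices are hypothetical and serve only to show the calculations.

```r
# Each row of S is one scenario; each column is one asset.
exchange_payoff    <- function(S1, S2, mult = 1) pmax(S1 - mult * S2, 0)
quanto_payoff      <- function(S1, S2, K) S1 * pmax(S2 - K, 0)   # S1: exchange rate
basket_call_payoff <- function(S, a, K) pmax(as.vector(S %*% a) - K, 0)
max_call_payoff    <- function(S, K) pmax(apply(S, 1, max) - K, 0)

# Two illustrative scenarios for three assets.
S <- rbind(c(10, 12, 9),
           c( 8,  7, 11))

ex  <- exchange_payoff(S[, 1], S[, 2])               # max(S1 - S2, 0) per scenario
qu  <- quanto_payoff(S[, 1], S[, 2], K = 10)         # S1 * max(S2 - K, 0)
bsk <- basket_call_payoff(S, a = c(1, 1, 1), K = 30) # call on the sum of the assets
mx  <- max_call_payoff(S, K = 10)                    # call on the maximum
rbind(exchange = ex, quanto = qu, basket = bsk, maximum = mx)
```

An option price estimate is then the discounted sample mean of the chosen payoff over the simulated scenarios.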


8.2 SIMULATING EUROPEAN MULTI-ASSET OPTIONS


Consider an option on two assets with payoff $F(S_1(T), S_2(T))$. In the
risk-neutral world, assets are assumed to follow the dynamics of

$$\frac{dS_i}{S_i} = r\,dt + \sigma_i\,dW_i, \quad i = 1, 2, \tag{8.1}$$

where

$$E(dW_1\,dW_2) = \rho\,dt, \tag{8.2}$$

and $E$ denotes the risk-neutral expectation. Then, the option can be simulated
via the Cholesky decomposition (see Theorem 4.4).
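
As a brief illustration of the Cholesky step, the R sketch below turns independent standard normals into a pair with correlation $\rho$ and checks the sample correlation. The value $\rho = 0.2$ and the sample size are illustrative assumptions. Note that chol() in R returns the upper triangular factor $U$ with $U^T U$ equal to the correlation matrix, so the independent normals are multiplied by $U$ from the right.

```r
set.seed(42)
rho <- 0.2; N <- 100000
R <- matrix(c(1, rho, rho, 1), 2, 2)   # target correlation matrix
U <- chol(R)                           # upper triangular, t(U) %*% U equals R

Z <- matrix(rnorm(2 * N), N, 2)        # independent N(0,1) draws, one row per scenario
X <- Z %*% U                           # X[,1] = Z1, X[,2] = rho*Z1 + sqrt(1-rho^2)*Z2

U
cor(X)[1, 2]                           # should be close to rho
```

For two assets, the factor reproduces the familiar construction $X_1 = Z_1$ and $X_2 = \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2$; the same code extends to more assets by enlarging the matrix R.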


Example 8.1 Suppose $S_1(0) = S_2(0) = 10$, $\sigma_1 = 0.3$, $\sigma_2 = 0.4$, $\rho = 0.2$,
and $r = 0.05$. Simulate the price of an exchange option with maturity of six
months.


By Itô's lemma, we derive the terminal asset prices as

$$S_1(T) = S_1(0)\, e^{(r - \sigma_1^2/2)T + \sigma_1 X_1 \sqrt{T}} \quad \text{and} \quad S_2(T) = S_2(0)\, e^{(r - \sigma_2^2/2)T + \sigma_2 X_2 \sqrt{T}}, \tag{8.3}$$

where

$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right).$$

The option price, $C_X$, can be determined by evaluating the expectation:

$$C_X = e^{-rT} E\{\max(S_1(T) - S_2(T), 0)\}.$$

We estimate the option price by the following simulation algorithm.

STEP 1: For $i = 1$ to $N$, do Step 2 to Step 4 as follows:

STEP 2: Generate $Z_1, Z_2 \sim N(0, 1)$ i.i.d.

STEP 3: Set $X_1 = Z_1$ and $X_2 = \rho Z_1 + \sqrt{1 - \rho^2}\, Z_2$.

STEP 4: Compute $S_1^{(i)}(T)$ and $S_2^{(i)}(T)$ by (8.3).

STEP 5: Set $C_X = \frac{e^{-rT}}{N} \sum_{i=1}^{N} \max(S_1^{(i)}(T) - S_2^{(i)}(T), 0)$.


Fig. 8.1 The distribution of simulated price.


Fig. 8.1 plots the distribution of the estimated price over 100 simulations.
We obtain the estimated option price to be 0.962. This algorithm is
implemented with SPLUS and the programming code is given as follows: □


S0 <- c(10, 10)
r <- 0.05
sigma1 <- 0.3
sigma2 <- 0.4
rho <- 0.2
T <- 0.5
N <- 10000
Z1 <- rnorm(N)
Z2 <- rnorm(N)
X1 <- Z1
X2 <- Z1*rho + Z2*sqrt(1-rho^2)   # correlated normals, as in STEP 3
S1 <- S0[1]*exp((r-sigma1^2/2)*T + sigma1*sqrt(T)*X1)
S2 <- S0[2]*exp((r-sigma2^2/2)*T + sigma2*sqrt(T)*X2)
C <- mean(pmax(S1-S2, 0))*exp(-r*T)
C


8.3 CASE STUDY: ON ESTIMATING BASKET OPTIONS


In practice, basket options are often valued by assuming that the value of the
portfolio of assets comprising the basket follows the Black-Scholes dynamics
jointly, rather than assuming that each asset follows the Black-Scholes dynamics
individually. After estimating the portfolio volatility from the portfolio return,
the basket call option is valued by substituting the portfolio volatility into the
Black-Scholes formula. This approach offers a quick solution to traders. However,
the risk manager needs to understand the risk of this simplifying assumption.
We examine this approach by means of simulation.


[Plot: stock price (vertical axis, 50 to 200) against time (horizontal axis)]

Fig. 8.2 The historical price of stocks.


Consider a basket call option with three underlying assets, $S_1$, $S_2$, and $S_3$.
The payoff of this option is $\max(S_1 + S_2 + S_3 - K, 0)$. In other words, the
holder of the option has the right to purchase the portfolio, formed as the sum
of the three assets, for a fixed value of $K$. Suppose the current time is t = 1 and we
have observed the prices of the three assets since t = 0. Fig. 8.2 depicts the paths of
the three simulated asset prices.


#### simulate the historical stock price ####
n <- 250; r <- 0.05; dt <- 1/n;
sigma <- c(0.4, 0.3, 0.35);
SIGMA <- matrix(c(1, 0.3, 0.3, 0.3, 1, 0.3, 0.3, 0.3, 1), 3, 3);
SIGMA <- SIGMA * (sigma %*% t(sigma))
S <- matrix(0, 3, n+1);
S[,1] <- c(100, 90, 80);
for (i in 1:n) {
  S[,i+1] <- S[,i]*exp(r*dt + rnorm(3) %*% chol(SIGMA)*sqrt(dt));
}
P <- S[1,] + S[2,] + S[3,];
print(P)
returnS <- diff(t(log(S)));
returnP <- diff(log(P));

cor(returnS)
A <- var(returnS)/dt
B <- var(returnP)/dt
C <- S[,n+1]

time <- (0:n)/n;
range <- c(min(S)*0.5, max(S)*1.2);
plot(time, S[1,], type = 'l', xlim = c(0, 1.6), ylim = range, ylab = 'stock price')
lines(time, S[2,], type = 'l')
lines(time, S[3,], type = 'l')
lines(c(1, 1), range, type = 'l')
##### cont'd #####


At t = 1, the asset prices are S, = 142.69, Sp = 89.23, and S3 = 49.73. The 
current portfolio value is the sum of and equals 281.65. Based on the three 
asset price paths, the portfolio volatility is estimated to be 0.280. Consider the 
basket option with a strike price of 250, maturity of half a year and interest 
rate of 5%. The naive application of the Black-Scholes formula produces a 
value for the option as 44.81. 

On the other hand, we can use MC simulation to estimate the option 
price by assuming individual asset follows the Black-Scholes dynamics. By 
examining the asset price paths, we estimate the variance-covariance matrix 
for assets returns as 

0.172 0.050 0.043 
0.050 0.088 0.038 
0.043 0.038 0.123 


Then, we simulate asset prices at t = 1.5 10,000 times using the Cholesky
decomposition. Fig. 8.3 illustrates the idea of generating asset values at
t = 1.5, the maturity of the option. Terminal values of individual assets are
simulated with an approach similar to (8.3) with three assets. The option
price is then evaluated by discounting the sample mean of the option payoff
using the interest rate of 5%. The simulated option price is 51.35, which is
larger than the naive value of 44.81.

We are also interested in how much the error in estimating the parameters
contributes to the option value. We perform a control experiment assuming that
the variance-covariance matrix can be estimated without error. Input the
variance-covariance matrix as


0.1600 0.0360 0.0420 
0.0360 0.0900 0.0315 
0.0420 0.0315 0.1225 


Using the same set of independent normal random numbers, we obtain the
option price as 50.71. The error in estimating the variance-covariance
matrix therefore contributes little to the basket option value.
Hence, the MC and the naive approaches to valuing basket options can
produce significantly different results, irrespective of the estimation error.

In practice, banks and financial institutions usually hold many derivatives
positions in their portfolios. Risk managers are responsible for checking the
consistency of the models used to value individual derivatives in the
portfolio. Imagine a bank that buys and sells options on individual
assets and on a basket of assets every day. When individual assets are assumed to
follow the Black-Scholes dynamics, it is crucial for the risk manager to realize
what assumptions have been imposed. The simulation shows that it
is not appropriate to assume that the portfolio constituting the basket also follows
Black-Scholes dynamics, because this assumption is not consistent with
the assumption on individual assets. In such a case, the value of basket options
can be significantly underestimated.


#### cont'd ####

range<-c(min(S)*0.5,max(S)*1.6);

plot(time,S[1,],type='l',xlim=c(0,1.6),ylim=range,ylab='stock price')
lines(time,S[2,],type='l')

lines(time,S[3,],type='l')

lines(c(1,1),range,type='l')

print(C)


#### simulate scenarios ####

K<-250; t<-0.5;

# naive Black-Scholes value; B is already a variance, so the drift term is r + B/2
d1 <- (log(sum(C)/K)+(r+B/2)*t)/sqrt(B*t);
d2 <- d1-sqrt(B*t);

sum(C)*pnorm(d1)-K*exp(-r*t)*pnorm(d2);


N <- 20000;
S <- matrix(0,N,3);
S[,1]<-C[1]; S[,2]<-C[2]; S[,3]<-C[3];


# 125 daily steps = half a year, using the estimated covariance matrix A
for (i in 1:125){
S<-S*exp(r*dt+matrix(rnorm(3*N),N,3)%*%chol(A)*sqrt(dt));
}


P<-S[,1]+S[,2]+S[,3];
price<-exp(-r*t)*pmax(P-K,0);      # discounted basket payoff per scenario
mean(price)


# draw the first 20 simulated terminal values on the plot
for (i in 1:20){
lines(c(1,1.5),c(C[1],S[i,1]),type='l')
lines(c(1,1.5),c(C[2],S[i,2]),type='l')
lines(c(1,1.5),c(C[3],S[i,3]),type='l')
}


Fig. 8.3 Simulating terminal asset prices.


8.4 DIMENSIONAL REDUCTION 


For an n-asset option, simulation can be constructed by using the Cholesky
decomposition (4.6). However, this requires generating n independent normal
random variables for each scenario. To reduce the computational burden, we
can use principal component analysis (PCA) to approximate the n factors
by a smaller number of factors, usually fewer than 10 in practice.

Suppose we have an n-dimensional random vector $X \sim N(0,\Sigma)$, where $\Sigma$ is
an $n \times n$ variance-covariance matrix. PCA for normal random variables
approximates X by a vector Y that follows a distribution similar to that of X but is
easier to simulate.

PCA uses the eigenvalue decomposition in Chapter 4 to approximate the
random vector X. Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $\Sigma$ and $v_1, v_2, \ldots, v_n$
be the corresponding eigenvectors. As variance-covariance matrices are positive
definite, their eigenvalues are all positive real numbers, so that the
corresponding square roots are positive real numbers. Theorem 4.6 asserts that
the random vector X can be written as

$$X = \sqrt{\lambda_1}\,v_1 Z_1 + \sqrt{\lambda_2}\,v_2 Z_2 + \cdots + \sqrt{\lambda_n}\,v_n Z_n, \tag{8.4}$$

where $Z_1, Z_2, \ldots, Z_n$ are i.i.d. standard normal random variables. The
equality (8.4) is defined in the sense of distribution. In PCA, we arrange the eigenvalues
in descending order such that $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. From (8.4), we see that
the contribution of the term $\sqrt{\lambda_i}\,v_i Z_i$ to the value of X decreases with the
index i. The eigenvector $v_i$ is called the i-th principal component (PC). To
approximate X, we truncate the sum in (8.4) such that

$$X \approx \sqrt{\lambda_1}\,v_1 Z_1 + \sqrt{\lambda_2}\,v_2 Z_2 + \cdots + \sqrt{\lambda_m}\,v_m Z_m,$$

where $m < n$. If we are comfortable with this approximation, we then simulate
m independent standard normal random variables $Z_i$ and calculate everything
based on this approximation.

An important topic in PCA is how to determine the value of m. The number
of terms used in the approximation depends on the accuracy of the outcome
required by the modeler. If the user requires 100% accuracy apart from simulation
error, then the full formula (8.4) should be used. PCA is useful when the user
accepts an accuracy of less than 100%. Suppose an accuracy
of at least 99% is required. Then, m is the minimum integer such that

$$\frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{n} \lambda_i} \geq 99\%.$$

A proof of this result can be found in standard texts in multivariate analysis,
for example Anderson (2003).
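
The truncation rule can be sketched in R as follows. This is only an illustrative sketch: the function names `choose_m` and `pca_simulate`, the 3 x 3 matrix `Sigma`, and the sample size are assumptions made for the example and do not come from the text.

```r
# Smallest m whose leading eigenvalues explain at least `accuracy` of the total.
choose_m <- function(lambda, accuracy) {
  ratio <- cumsum(lambda) / sum(lambda)   # explained proportion for m = 1, 2, ...
  which(ratio >= accuracy)[1]
}

# Simulate N draws of X ~ N(0, Sigma) approximately, using only m factors.
pca_simulate <- function(Sigma, accuracy, N) {
  e <- eigen(Sigma, symmetric = TRUE)     # eigenvalues are returned in descending order
  m <- choose_m(e$values, accuracy)
  Z <- matrix(rnorm(N * m), N, m)         # m independent N(0,1) factors per draw
  # loadings: row i is sqrt(lambda_i) * v_i, so row k of X is sum_i sqrt(lambda_i) Z[k,i] v_i
  loadings <- sqrt(e$values[1:m]) * t(e$vectors[, 1:m, drop = FALSE])
  list(m = m, X = Z %*% loadings)
}

# Illustrative (hypothetical) covariance matrix with one dominant factor.
Sigma <- matrix(c(1.0, 0.9, 0.8,
                  0.9, 1.0, 0.9,
                  0.8, 0.9, 1.0), 3, 3)
set.seed(1)
out <- pca_simulate(Sigma, accuracy = 0.95, N = 10000)
out$m        # number of factors kept
cov(out$X)   # sample covariance, to be compared with Sigma
```

Raising `accuracy` towards 1 keeps more factors and brings the sample covariance closer to `Sigma`, at the cost of more random numbers per scenario.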

Let us apply PCA to multi-asset option pricing. Consider an option with
10 underlying assets. Each asset follows the Black-Scholes dynamics such that

$$dS_i = \mu_i S_i\,dt + \sigma_i S_i\,dW_i, \qquad i = 1, 2, \ldots, 10,$$

where $S_i$ is the value of the i-th asset. The $W_1, W_2, \ldots, W_{10}$ are correlated
Brownian motions with correlation matrix:


 1.00  0.74  0.34 -0.08  0.05 -0.74  0.04 -0.12  0.81  0.82
 0.74  1.00  0.81 -0.04 -0.57 -0.25  0.06  0.47  0.89  0.92
 0.34  0.81  1.00 -0.17 -0.83  0.20 -0.09  0.78  0.65  0.72
-0.08 -0.04 -0.17  1.00  0.01 -0.05  0.94 -0.04 -0.09 -0.05
 0.05 -0.57 -0.83  0.01  1.00 -0.55  0.00 -0.94 -0.41 -0.45
-0.74 -0.25  0.20 -0.05 -0.55  1.00 -0.16  0.65 -0.40 -0.40
 0.04  0.06 -0.09  0.94  0.00 -0.16  1.00 -0.06  0.04  0.06
-0.12  0.47  0.78 -0.04 -0.94  0.65 -0.06  1.00  0.31  0.34
 0.81  0.89  0.65 -0.09 -0.41 -0.40  0.04  0.31  1.00  0.91
 0.82  0.92  0.72 -0.05 -0.45 -0.40  0.06  0.34  0.91  1.00


A discrete approximation to the asset price dynamics is

$$\Delta S_i = r S_i\,\Delta t + \sigma_i S_i \sqrt{\Delta t}\,\epsilon_i,$$

where $\epsilon_i$ are risk factors such that $[\epsilon_i] = X \sim N(0, \Sigma)$, and $\Sigma$ is the correlation
matrix given above.

For the given correlation matrix, eigenvalues are obtained as 4.719, 2.843, 
1.931, 0.147, 0.104, 0.079, 0.062, 0.056, 0.038, 0.022. Summing up all the 
eigenvalues gives a value of 10. When we divide the sum of the first three 
eigenvalues by the total sum, the ratio is close to 95%. Therefore, if we 
accept an error of 5%, the first three PCs provide sufficient accuracy. Eigen- 
vectors corresponding to the first three PCs are found to be: 


v1:  0.31  0.44  0.41 -0.05 -0.31 -0.07 -0.00  0.27  0.42  0.43
v2:  0.41  0.08 -0.22  0.09  0.41 -0.57  0.14 -0.45  0.18  0.16
v3:  0.09 -0.03  0.01 -0.70  0.12 -0.05 -0.69 -0.10  0.02 -0.00

Based on the first three PCs, we generate three independent standard normal
random variables, namely $Z_1$, $Z_2$, and $Z_3$, and approximate the 10 risk factors
by

$$[\epsilon_i] \approx Z_1\sqrt{\lambda_1}\,v_1 + Z_2\sqrt{\lambda_2}\,v_2 + Z_3\sqrt{\lambda_3}\,v_3.$$

The 10 risk factors $\epsilon_1, \epsilon_2, \ldots, \epsilon_{10}$ are reduced to three independent factors
only. Hence, we reduce a 10-dimensional problem to a 3-dimensional problem.
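
As a rough check of this reduction, the following R sketch recomputes the explained proportion from the eigenvalues quoted above and rebuilds the risk factors from the three rounded eigenvectors. The variable names and the number of scenarios are illustrative assumptions, and because the eigenvectors are rounded to two decimals the implied correlations are only approximate.

```r
# Eigenvalues and the first three eigenvectors as quoted (rounded) in the text.
lambda <- c(4.719, 2.843, 1.931, 0.147, 0.104, 0.079, 0.062, 0.056, 0.038, 0.022)
V <- rbind(
  v1 = c(0.31,  0.44,  0.41, -0.05, -0.31, -0.07,  0.00,  0.27, 0.42, 0.43),
  v2 = c(0.41,  0.08, -0.22,  0.09,  0.41, -0.57,  0.14, -0.45, 0.18, 0.16),
  v3 = c(0.09, -0.03,  0.01, -0.70,  0.12, -0.05, -0.69, -0.10, 0.02, 0.00)
)

# Proportion of total variance explained by the first m eigenvalues.
ratio <- cumsum(lambda) / sum(lambda)
round(ratio[1:4], 3)

# Correlation matrix implied by three PCs: sum_k lambda_k v_k v_k'.
approx_corr <- t(V) %*% (lambda[1:3] * V)
round(approx_corr[1, 2], 2)   # compare with the exact entry 0.74

# Simulate N scenarios of the 10 risk factors from 3 independent normals.
set.seed(1)
N <- 5000
Z <- matrix(rnorm(N * 3), N, 3)
eps <- Z %*% (sqrt(lambda[1:3]) * V)   # N x 10 matrix of approximate risk factors
dim(eps)
```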


Example 8.2 Value a maximum option on 10 assets with a strike price of
$95 and a maturity of half a year. All asset values are currently $100 with
volatilities of 30% for all assets. The correlation matrix of risk factors is given
above. The interest rate is 4%. We accept a maximum error of 5%.


The option payoff is $\max[\max(S_1, S_2, \ldots, S_{10}) - 95, 0]$. As the option is
traded in European style, it is efficient to simulate terminal asset values
directly. By Itô's lemma, we know that the terminal value of the i-th asset is
given by

$$S_i(T) = S_i(0)\exp\left[(r - \sigma_i^2/2)T + \sigma_i W_i(T)\right] = S_i(0)\exp\left[(r - \sigma_i^2/2)T + \sigma_i\sqrt{T}\,\epsilon_i\right], \tag{8.5}$$

where the vector $[\epsilon_i] \sim N(0, \Sigma)$. Our simulation obtains the option price to
be 49.12. The corresponding programming code is given as follows.


#### input the correlation matrix ####

A<-matrix(0,10,10)

A[1,]<-c( 1.00, 0.74, 0.34,-0.08, 0.05,-0.74, 0.04,-0.12, 0.81, 0.82);
A[2,]<-c( 0.74, 1.00, 0.81,-0.04,-0.57,-0.25, 0.06, 0.47, 0.89, 0.92);
A[3,]<-c( 0.34, 0.81, 1.00,-0.17,-0.83, 0.20,-0.09, 0.78, 0.65, 0.72);
A[4,]<-c(-0.08,-0.04,-0.17, 1.00, 0.01,-0.05, 0.94,-0.04,-0.09,-0.05);
A[5,]<-c( 0.05,-0.57,-0.83, 0.01, 1.00,-0.55, 0.00,-0.94,-0.41,-0.45);
A[6,]<-c(-0.74,-0.25, 0.20,-0.05,-0.55, 1.00,-0.16, 0.65,-0.40,-0.40);
A[7,]<-c( 0.04, 0.06,-0.09, 0.94, 0.00,-0.16, 1.00,-0.06, 0.04, 0.06);
A[8,]<-c(-0.12, 0.47, 0.78,-0.04,-0.94, 0.65,-0.06, 1.00, 0.31, 0.34);
A[9,]<-c( 0.81, 0.89, 0.65,-0.09,-0.41,-0.40, 0.04, 0.31, 1.00, 0.91);
A[10,]<-c( 0.82, 0.92, 0.72,-0.05,-0.45,-0.40, 0.06, 0.34, 0.91, 1.00);


sigma <- 0.3

A <- A*sigma^2                      # covariance matrix of the risk factors

eigenvalues <- eigen(A)$values

eigenvectors<- eigen(A)$vectors

print(eigenvalues) ## check that the first 3 values are the largest


### input parameters ###
S0 <- 100;
K <- 95;
r <- 0.04;
n <- 5000;                          # number of scenarios
t <- 1;                             # maturity as printed in the listing; Example 8.2 states half a year (t <- 0.5)
B <- matrix(0,n,3);
S <- matrix(S0,10,n);
price <- c(rep(0,n));


### generate scenarios ###

# three independent factors, each scaled by the square root of its eigenvalue
for (j in 1:3){ B[,j] <- rnorm(n,0,1)*sqrt(eigenvalues[j]) }
returnS <- eigenvectors[,1:3] %*% t(B);

S <- S*exp((r-sigma^2/2)*t + returnS*sqrt(t));


for (i in 1:n) {price[i] <- max(max(S[,i])-K,0)}
value<-mean(price)*exp(-r*t)
value


8.5 EXERCISES


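1. Suppose x(t) and y(t) are two correlated Itô processes such that

$$dx = a(t,x)\,dt + b(t,x)\,dW_1,$$
$$dy = \alpha(t,y)\,dt + \beta(t,y)\,dW_2,$$
$$E(dW_1\,dW_2) = \rho\,dt.$$

Consider a function, f(t,x,y), which depends on both stochastic variables
x(t) and y(t). By modifying the proof of Theorem 2.1, show
that the dynamics of f(t,x,y) is

$$df = \left(\frac{\partial f}{\partial t} + a\frac{\partial f}{\partial x} + \alpha\frac{\partial f}{\partial y} + \frac{b^2}{2}\frac{\partial^2 f}{\partial x^2} + \frac{\beta^2}{2}\frac{\partial^2 f}{\partial y^2} + \rho b\beta\frac{\partial^2 f}{\partial x\,\partial y}\right)dt + b\frac{\partial f}{\partial x}\,dW_1 + \beta\frac{\partial f}{\partial y}\,dW_2. \tag{8.6}$$

This formula is known as Itô's lemma for two variables.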


2. Answer the following questions by considering the property of martingales
defined in Question 7 of Chapter 3.


(a) Consider a pair of asset price dynamics under the risk-neutral measure:

$$dS_1 = rS_1\,dt + \sigma_1 S_1\,dW_1,$$
$$dS_2 = rS_2\,dt + \sigma_2 S_2\,dW_2,$$
$$E(dW_1\,dW_2) = \rho\,dt.$$

Show that the stochastic process $X(t) = S_1(t)/S_2(t)$ is a martingale
under the Brownian motions $\widetilde{W}_1(t)$ and $\widetilde{W}_2(t)$, where

$$\widetilde{W}_1(t) = W_1(t) - \rho\sigma_2 t \quad\text{and}\quad \widetilde{W}_2(t) = W_2(t) - \sigma_2 t.$$

(b) Under (a), show that X(t) has the dynamics

$$\frac{dX}{X} = \sigma\,dW^*,$$

where $W^*$ is a Brownian motion and

$$\sigma^2 = \sigma_1^2 - 2\rho\sigma_1\sigma_2 + \sigma_2^2.$$

(c) Consider a function of $S_1$ and $S_2$, $V(t, S_1, S_2)$, which has the property that

$$V(t, S_1, S_2) = S_2\,U(t, S_1/S_2) = S_2\,U(t, X).$$

Show that U(t, X) is a martingale under the Brownian motions $\widetilde{W}_1(t)$
and $\widetilde{W}_2(t)$.

3. Consider the exchange option with payoff $\max(S_1(T) - S_2(T), 0)$. Denote
the option pricing formula for this option as $V_{ex}(t, S_1, S_2)$. By
using the no-arbitrage argument, one derives that the exchange option
has the properties:
• There exists a function U such that $V_{ex}(t, S_1, S_2) = S_2\,U(t, S_1/S_2)$.
• There exists a probability measure Q such that $X(t) = S_1(t)/S_2(t)$
is a martingale.


Based on these properties and the results obtained in Question 2, show
that

$$V_{ex} = S_1\Phi(d_1^+) - S_2\Phi(d_1^-),$$

where

$$d_1^+ = \frac{\ln(S_1/S_2) + \sigma^2(T-t)/2}{\sigma\sqrt{T-t}},$$
$$d_1^- = d_1^+ - \sigma\sqrt{T-t},$$
$$\sigma^2 = \sigma_1^2 - 2\rho\sigma_1\sigma_2 + \sigma_2^2.$$

Herein, $\sigma_1$ and $\sigma_2$ are the volatilities of $S_1$ and $S_2$, respectively, and $\rho$ is the
correlation coefficient between the returns of the two assets.


4. Run the simulation program for pricing the exchange option and compare
the numerical result with the analytical one.


5. The so-called geometric basket option has the payoff function

$$\max\left[\left(\prod_{i=1}^{n} S_i(T)\right)^{1/n} - K,\; 0\right].$$

(a) Show that this option has a value less than the usual basket option
with payoff

$$\max\left[\frac{1}{n}\sum_{i=1}^{n} S_i(T) - K,\; 0\right].$$

(b) Suppose individual assets follow the Black-Scholes dynamics. Derive
the analytical pricing formula for the geometric basket option.


(c) By regarding the price of the geometric basket option as a control
variate, simulate the price of the usual basket option that depends
on four assets with the following correlation matrix:


1.0000 0      0.3000 0.3000
0      1.0000 0.4000 0.2000
0.3000 0.4000 1.0000 0.3000
0.3000 0.2000 0.3000 1.0000

We assume that all assets share the same volatility of 30% and that each
asset individually follows the Black-Scholes dynamics.


6. Use simulation to determine the value and early exercise policy of American
style exchange options. We assume an interest rate of 5%, $S_1 = 100$,
$S_2 = 95$, T = 1 year, and the variance-covariance matrix of asset returns:

$$\begin{pmatrix} 0.016 & 0.006 \\ 0.006 & 0.09 \end{pmatrix}.$$

Hint: you may use the LS model and a quadratic polynomial of $S_1$ and
$S_2$ in your regression.


Interest Rate Models


9.1 INTRODUCTION 


Fixed income securities are concerned with the valuation of promised payments
at a future date. For example, a zero coupon bond promises to pay a
single payment on the maturity day. A straight US Treasury bond promises
to make payments with the amount and date of the payments determined by
the face value, maturity date, and coupon rate of the bond. Because cash
flows are certain, we are not concerned with the risk of the volatility of the
amount of cash. Instead we are interested in the following question: How
much would a rational individual be willing to pay today for a promised payment
in the future? The answer to this question is related to the interest rate
movement. This leads to the next question: How should interest rate
risk be managed? Simulation can serve as a useful tool to gain some insights in answering
these questions.


9.2 DISCOUNT FACTOR 


Consider the simplest case: a zero coupon bond (zeros) paying $100 a year
from now. What is the maximum value one is willing to pay for this contract
today? Purchasing this bond should be worth at least as much as putting the
money into the bank. Let P be the payment at the current moment. Then,

$$P(1 + R) = 100,$$

where R is the current annual interest rate paid by a bank (R is supposed to be a
constant). That is,

$$P = \frac{100}{1+R}.$$

P and $\frac{1}{1+R}$ are known as the zeros price and discount factor, respectively.
One may regard the discount factor as the zeros price with unity payment on
the maturity T. We usually write

$$P(0,T) = P \quad\text{and}\quad B(0,T) = \text{discount factor from } T \text{ to } 0.$$

For a zero coupon bond, we have

$$P(0,T) = P(T,T)\,B(0,T). \tag{9.1}$$


Suppose that the interest rate is continuously compounded. Then,
the discount factor becomes

$$B(t,T) = e^{-r(T-t)}. \tag{9.2}$$

Notice that the continuously compounding interest rate r and the annual
interest rate R are connected by the formula

$$\frac{1}{1+R} = e^{-r}.$$

What about multiple dates at which payments will be made? A typical
bond will pay coupons at semi-annual intervals and a principal payment
at maturity. The key to evaluating such bonds is to view dollars
promised at different future dates as separate zero coupon bonds. We then
value each payment at each date using the discount factor for that date
and sum up the values. For a series of cash flows $C(t_i)$ at times
$t_1, t_2, \ldots, t_N$, the coupon bearing bond can be valued by the formula:

$$P(0,T) = \sum_{i=1}^{N} C(t_i)\,B(0,t_i). \tag{9.3}$$

For example, the value of a bond paying semi-annual coupons is given by

$$P(0,N) = \frac{C(1/2)}{(1+R)^{1/2}} + \frac{C(1)}{1+R} + \cdots + \frac{C(N)}{(1+R)^{N}}.$$
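
A minimal R sketch of formula (9.3) with a constant annual rate is given below. The function names and the two-year bond used as an example (face value 100, 6% coupon paid semiannually, R = 5%) are hypothetical choices made for illustration.

```r
# Discount factor from time t to 0 under a constant annual rate R.
discount_factor <- function(t, R) (1 + R)^(-t)

# Bond value as the sum of discounted cash flows, as in (9.3).
bond_value <- function(times, cashflows, R) {
  sum(cashflows * discount_factor(times, R))
}

# Hypothetical 2-year bond: coupons of 3 every half year, principal 100 at maturity.
times     <- seq(0.5, 2, by = 0.5)
cashflows <- c(3, 3, 3, 103)
bond_value(times, cashflows, R = 0.05)

# Continuously compounded rate r equivalent to R, from 1/(1+R) = exp(-r).
r <- log(1 + 0.05)
exp(-r)   # equals the one-year discount factor 1/(1+R)
```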


9.2.1 Time-Varying Interest Rate 


The previous examples value bonds when the interest rate is assumed to be
a constant. A more realistic model considers a time-varying deterministic
interest rate. For instance, we assume that the continuously compounding
interest rate r is a function of time t, i.e., r = r(t). We are concerned with the
evolution of the discount factor B(t,T), which can be viewed as the zeros price
at t with unity payment at T. The rate of return of this zero coupon bond
must be the instantaneous interest rate r(t). Thus, we have

$$\frac{dB(t,T)}{B(t,T)} = r(t)\,dt, \qquad B(T,T) = 1.$$

Solving the above ordinary differential equation gives

$$B(t,T) = e^{-\int_t^T r(s)\,ds}. \tag{9.4}$$

If we use $y(t,T) = \frac{1}{T-t}\int_t^T r(s)\,ds$ to denote the mean of the interest rate in
the interval [t, T], then (9.4) can be written as

$$B(t,T) = e^{-y(t,T)(T-t)},$$

which agrees with (9.2). We can use (9.1) and (9.3) to evaluate zero prices
and coupon-bearing bonds, respectively. As y(t,T) is also defined as the yield
of a bond, the discrete yield to maturity $Y_t$ is given by

$$Y_t = 1 - e^{-y(t,t+1)}.$$

$Y_t$ and y(t,T) are the constant annual rate and constant instantaneous rate
of a bond. Bond markets usually quote the yield $Y_t$ in place of the interest
rate r(t). In general, y(t,T) and r(t) coincide only when r(t) is a constant.
Further discussions about yields and interest rate models can be found
in Jarrow (2002).
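
The relation between a deterministic rate curve, the discount factor (9.4), and the yield can be sketched numerically in R. The linear rate function below is a hypothetical example, and the helper names are assumptions; `integrate` is R's built-in one-dimensional quadrature routine and needs a vectorized integrand.

```r
# Discount factor B(t,T) = exp(-integral of r(s) ds over [t,T]).
discount_factor_tv <- function(rate_fn, t, T) {
  exp(-integrate(rate_fn, lower = t, upper = T)$value)
}

# Yield y(t,T): the average rate over [t,T] implied by the discount factor.
yield_tv <- function(rate_fn, t, T) {
  -log(discount_factor_tv(rate_fn, t, T)) / (T - t)
}

# Hypothetical rate curve rising linearly from 3% at 1% per year.
rate_fn <- function(s) 0.03 + 0.01 * s

discount_factor_tv(rate_fn, 0, 2)   # exp(-0.08), since the integral is 0.06 + 0.02
yield_tv(rate_fn, 0, 2)             # 0.04, the mean rate over [0, 2]
```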


9.3. STOCHASTIC INTEREST RATE MODELS AND THEIR 
SIMULATIONS 


Deterministic interest rate models are inadequate to capture interest rate
movements as no one knows future interest rates for certain. A better approach
is to incorporate the stochastic feature of the interest rates. A stochastic
interest rate model should reduce to (9.4) when the stochastic component
is absent. A natural way is to consider

$$B(t,T) = E\left(e^{-\int_t^T r(s, W_s)\,ds}\right), \tag{9.5}$$

where $W_s$ is a vector of stochastic factors. If stochastic factors are absent,
the function inside the expectation becomes deterministic and the expectation
equals the function itself.

From a simulation perspective, expression (9.5) offers a means to conduct
Monte Carlo simulations. Once an appropriate stochastic interest rate model,
such as the Vasicek model of Vasicek (1977), the CIR model of Cox, Ingersoll
and Ross (1985), the Ho-Lee model of Ho and Lee (1986), or the Hull-White
model of Hull and White (1988), is formulated, simulations can be conducted.
To illustrate the ideas, consider a short rate model that follows

$$dr = \alpha(t,r)\,dt + \beta(t,r)\,dW_t, \tag{9.6}$$

where r is the current continuously compounded interest rate and $W_t$ is a
Wiener process. For example, the Vasicek model assumes that $\alpha(t,r) = a(b - r)$
and $\beta(t,r) = \sigma$, whereas the CIR model uses the same $\alpha(t,r)$ with $\beta(t,r) = \sigma\sqrt{r}$.
Sample paths of short rate models of the form (9.6) can be generated
by the following steps:


STEP 1: Set $r_i = r_0$, the current market rate
STEP 2: Generate $\epsilon \sim N(0, 1)$
STEP 3: Set $r_{i+1} = r_i + \alpha(t_i, r_i)\,\Delta t + \beta(t_i, r_i)\,\epsilon\sqrt{\Delta t}$
STEP 4: Go to Step 2
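
The four steps can be sketched in R as below. The function name `simulate_short_rate` and all parameter values are hypothetical. For the CIR case the rate is floored at zero inside the square root, which is an assumption added here to keep the Euler scheme well defined and is not part of the steps above.

```r
# Euler scheme for dr = alpha(t, r) dt + beta(t, r) dW on [0, T] with n steps.
simulate_short_rate <- function(r0, alpha, beta, T, n) {
  dt <- T / n
  r <- numeric(n + 1)
  r[1] <- r0                                  # STEP 1: start from the current rate
  for (i in 1:n) {
    t_i <- (i - 1) * dt
    eps <- rnorm(1)                           # STEP 2: standard normal draw
    r[i + 1] <- r[i] + alpha(t_i, r[i]) * dt +
      beta(t_i, r[i]) * eps * sqrt(dt)        # STEP 3: Euler update
  }                                           # STEP 4: repeat until T
  r
}

# Hypothetical parameters.
a <- 0.5; b <- 0.05; sigma <- 0.01
set.seed(1)
vasicek_path <- simulate_short_rate(0.04,
  alpha = function(t, r) a * (b - r),
  beta  = function(t, r) sigma,
  T = 1, n = 250)
cir_path <- simulate_short_rate(0.04,
  alpha = function(t, r) a * (b - r),
  beta  = function(t, r) sigma * sqrt(max(r, 0)),  # floor at 0: an added assumption
  T = 1, n = 250)
```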
Let $r^{(j)} = \{r^{(j)}(t) : t = 0, \tfrac{1}{n}, \tfrac{2}{n}, \ldots, T\}$ be the j-th interest rate path out
of M sample paths generated by the preceding algorithm with $\Delta t = \tfrac{1}{n}$. By
means of quadrature, we can make the following approximation:

$$\int_{t_1}^{t_2} r^{(j)}(t)\,dt \approx \frac{1}{n}\sum_{t \in [t_1, t_2]} r^{(j)}(t).$$

If we take $\Delta t = \tfrac{1}{250}$, the discount factor B(0,1) can be approximated by

$$B(0,1) \approx E\left\{\exp\left(-\frac{1}{250}\sum_{i=1}^{250} r\!\left(\tfrac{i}{250}\right)\right)\right\} \approx \frac{1}{M}\sum_{j=1}^{M}\exp\left(-\frac{1}{250}\sum_{i=1}^{250} r^{(j)}\!\left(\tfrac{i}{250}\right)\right).$$

In general, we write

$$B(t,T) \approx \frac{1}{M}\sum_{j=1}^{M}\exp\left(-\frac{1}{n}\sum_{t_i \in [t,T]} r^{(j)}(t_i)\right) = \frac{1}{M}\sum_{j=1}^{M}\exp\left(-\operatorname*{Average}_{t_i \in [t,T]} r^{(j)}(t_i) \times (T-t)\right). \tag{9.7}$$
j=l 


Consider the Vasicek model proposed in Vasicek (1977),

$$dr(t) = a(b - r(t))\,dt + \sigma\,dW_t.$$

The discrete version is

$$r(t + \Delta t) = r(t)(1 - a\Delta t) + ab\Delta t + \sigma(W_{t+\Delta t} - W_t).$$

A sample path can be generated by

$$r_1 = r_0(1 - a\Delta t) + ab\Delta t + \epsilon_1\sigma\sqrt{\Delta t},$$
$$r_2 = r_1(1 - a\Delta t) + ab\Delta t + \epsilon_2\sigma\sqrt{\Delta t},$$
$$\vdots$$
$$r_n = r_{n-1}(1 - a\Delta t) + ab\Delta t + \epsilon_n\sigma\sqrt{\Delta t},$$

where $\Delta t = T/n$.
Using (9.5), one obtains a representation for the zero coupon bond price
for the Vasicek model as

$$B(0,T) \approx E\left\{\exp\left(-\Delta t\sum_{i=1}^{n} r_i\right)\right\}.$$

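Moreover, (9.7) can also be used to estimate a zero coupon bond price. To
achieve an efficient simulation procedure, we need to scrutinize the preceding
approximation more carefully. Express $r_n$ in terms of $\epsilon_1, \ldots, \epsilon_n$ and $r_0$:

$$r_n = (1 - a\Delta t)^n r_0 + ab\Delta t\left[1 + (1 - a\Delta t) + \cdots + (1 - a\Delta t)^{n-1}\right] + \sigma\sqrt{\Delta t}\left[(1 - a\Delta t)^{n-1}\epsilon_1 + \cdots + (1 - a\Delta t)\epsilon_{n-1} + \epsilon_n\right]. \tag{9.8}$$

Since $r_n$ equals a constant plus a sum of normal variates, it is normally
distributed. The coefficient of $ab\Delta t$ is a geometric progression with sum
$[1 - (1 - a\Delta t)^n]/(a\Delta t)$. This implies that the sum $X = -\Delta t\sum_{i=1}^{n} r_i$ is also a normal
random variable. To obtain an efficient simulation, it is worth finding out the
mean and the variance of X. Note that

$$\sum_{i=1}^{n} r_i = r_0\sum_{i=1}^{n}(1 - a\Delta t)^i + ab\Delta t\sum_{i=1}^{n}\frac{1 - (1 - a\Delta t)^i}{a\Delta t} + \sigma\sqrt{\Delta t}\sum_{i=1}^{n}\epsilon_i\,\frac{1 - (1 - a\Delta t)^{n+1-i}}{a\Delta t}.$$

Simplifying,

$$\sum_{i=1}^{n} r_i = (r_0 - b)\,\frac{(1 - a\Delta t) - (1 - a\Delta t)^{n+1}}{a\Delta t} + bn + \sigma\sqrt{\Delta t}\sum_{i=1}^{n}\epsilon_i\,\frac{1 - (1 - a\Delta t)^{n+1-i}}{a\Delta t}.$$

Hence,

$$E(X) = -(r_0 - b)\,\frac{(1 - a\Delta t) - (1 - a\Delta t)^{n+1}}{a} - bT,$$

$$\mathrm{Var}(X) = \frac{\sigma^2\Delta t}{a^2}\sum_{i=1}^{n}\left[1 - (1 - a\Delta t)^i\right]^2 = \frac{\sigma^2}{a^2}\left[T - \frac{2(1 - a\Delta t)\left(1 - (1 - a\Delta t)^n\right)}{a} + \Delta t\,\frac{(1 - a\Delta t)^2\left(1 - (1 - a\Delta t)^{2n}\right)}{1 - (1 - a\Delta t)^2}\right].$$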
Therefore, the Vasicek discount factor can be estimated by

$$B(0,T) \approx \frac{1}{M}\sum_{j=1}^{M} e^{E(X) + \sqrt{\mathrm{Var}(X)}\,z_j},$$

where $z_j \sim N(0, 1)$.
To improve the simulation further, we observe that $E(e^X)$ is actually the
moment generating function of X evaluated at 1, which equals

$$\exp\left[E(X) + \tfrac{1}{2}\mathrm{Var}(X)\right]. \tag{9.9}$$

The closed-form formula for the Vasicek discount factor can be obtained by
letting $n \to \infty$ (or $\Delta t \to 0$) in the mean and the variance expressions. The
limit is

$$B(0,T) = \exp\left\{-bT - (r_0 - b)\,\frac{1 - e^{-aT}}{a} + \frac{\sigma^2}{4a^3}\left(4e^{-aT} - e^{-2aT} + 2aT - 3\right)\right\}. \tag{9.10}$$

This closed-form formula allows us to check the accuracy of simulating the
Vasicek model.
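
Such a check might look like the following R sketch. It assumes that (9.10) reads $B(0,T) = \exp\{-bT - (r_0-b)(1-e^{-aT})/a + \sigma^2(4e^{-aT} - e^{-2aT} + 2aT - 3)/(4a^3)\}$ and that the discrete mean and variance of $X = -\Delta t\sum r_i$ are as derived above. The function names and parameter values are hypothetical.

```r
# Closed-form Vasicek discount factor, as in (9.10).
vasicek_closed_form <- function(r0, a, b, sigma, T) {
  exp(-b * T - (r0 - b) * (1 - exp(-a * T)) / a +
        sigma^2 / (4 * a^3) * (4 * exp(-a * T) - exp(-2 * a * T) + 2 * a * T - 3))
}

# Plain Monte Carlo: simulate M discretized paths and average exp(-dt * sum r_i).
vasicek_mc <- function(r0, a, b, sigma, T, n, M) {
  dt <- T / n
  r <- rep(r0, M)
  integral <- numeric(M)
  for (i in 1:n) {
    r <- r * (1 - a * dt) + a * b * dt + sigma * sqrt(dt) * rnorm(M)
    integral <- integral + r * dt
  }
  mean(exp(-integral))
}

# No simulation: exp(E(X) + Var(X)/2) with the discrete mean and variance, as in (9.9).
vasicek_discrete <- function(r0, a, b, sigma, T, n) {
  dt <- T / n
  q <- 1 - a * dt
  EX   <- -(r0 - b) * (q - q^(n + 1)) / a - b * T
  VarX <- sigma^2 * dt / a^2 * sum((1 - q^(1:n))^2)
  exp(EX + VarX / 2)
}

set.seed(1)
cf   <- vasicek_closed_form(0.04, 0.5, 0.05, 0.02, 1)
mc   <- vasicek_mc(0.04, 0.5, 0.05, 0.02, 1, n = 250, M = 2000)
disc <- vasicek_discrete(0.04, 0.5, 0.05, 0.02, 1, n = 250)
c(closed_form = cf, monte_carlo = mc, discrete = disc)
```

The three numbers should agree closely; the Monte Carlo value carries sampling error of order $1/\sqrt{M}$, while the discrete formula differs from the closed form only through the step size.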

To define the yield in a stochastic interest rate model, we follow (9.4) and define
the yield y(t,T) through

$$B(t,T) = e^{-y(t,T)(T-t)}.$$

Equivalently,

$$y(t,T) = -\frac{\log B(t,T)}{T-t} = \frac{1}{T-t}\log\frac{1}{B(t,T)}.$$

In practice, it is sometimes more relevant to know the yield of a bond rather
than its value because market practitioners are used to extracting bond prices
from the yield curve.


9.4 OPTIONS WITH STOCHASTIC INTEREST RATE 


In Chapter 5, we simulated option prices for constant interest rate models. We
now study simulating option prices with a stochastic interest rate. Recall that
the call price can be written as

$$\text{Call} = \text{Discount factor} \times E[\max(S_T - K, 0)].$$

Two features of a stochastic interest rate are:

1. The discount factor should be calculated under a stochastic interest
rate model.

2. The movement of the interest rate may affect the stock price.


The first feature can be tackled by calculating the discount factor using the
method mentioned earlier. For the second feature, we consider the Black-Scholes
asset dynamics under the Vasicek interest rate economy. The stock
movement and interest rate movement are, respectively,

$$dS_t = r(t)S_t\,dt + \sigma_1 S_t\,dW_1 \quad\text{and}\quad dr(t) = a(b - r(t))\,dt + \sigma_2\,dW_2,$$

or, equivalently,

$$S_{t+\Delta t} = S_t + r(t)S_t\,\Delta t + \sigma_1 S_t\,\epsilon_1\sqrt{\Delta t},$$
$$r(t+\Delta t) = r(t) + a(b - r(t))\,\Delta t + \sigma_2\,\epsilon_2\sqrt{\Delta t}. \tag{9.11}$$


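Suppose that the correlation coefficient of the stock and the interest rate is $\rho$.
Then

$$\begin{pmatrix}\epsilon_1\\ \epsilon_2\end{pmatrix} \sim N\left(\begin{pmatrix}0\\ 0\end{pmatrix}, \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\right).$$

Computer software usually generates independent normal random variables
by default. To generate correlated random variables, the following transformation
of variables can be made:

$$z_1 = \epsilon_1, \qquad z_2 = \frac{\epsilon_2 - \rho\,\epsilon_1}{\sqrt{1 - \rho^2}}.$$

It is easily seen that

$$\mathrm{Var}(z_1) = 1, \qquad \mathrm{Var}(z_2) = 1, \qquad \mathrm{Cov}(z_1, z_2) = 0,$$

and $z_1$ and $z_2$ are i.i.d. standard normal random variables. We generate
$(\epsilon_1, \epsilon_2)$ by the following procedure.


STEP 1: Generate $z_1, z_2 \sim N(0,1)$ i.i.d.
STEP 2: Set $\epsilon_1 = z_1$ and $\epsilon_2 = z_2\sqrt{1 - \rho^2} + \rho z_1$.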
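
The two steps translate directly into R. The sketch below is illustrative only; the function name `correlated_normals`, the sample size, and the value of the correlation are assumptions.

```r
# Generate N pairs (eps1, eps2) of standard normals with correlation rho.
correlated_normals <- function(N, rho) {
  z1 <- rnorm(N)                                  # STEP 1: independent N(0,1) draws
  z2 <- rnorm(N)
  cbind(eps1 = z1,                                # STEP 2: eps1 = z1
        eps2 = z2 * sqrt(1 - rho^2) + rho * z1)   #         eps2 = z2*sqrt(1-rho^2) + rho*z1
}

set.seed(1)
eps <- correlated_normals(100000, rho = -0.3)
cor(eps[, "eps1"], eps[, "eps2"])   # sample correlation, close to rho
apply(eps, 2, var)                  # both variances close to 1
```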


Substitute $\epsilon_1$ and $\epsilon_2$ into (9.11) to generate future possible asset prices and
interest rates. The call option price can be obtained by the standard Monte
Carlo method as

$$C(S,0) \approx \frac{B(0,T)}{n}\sum_{j=1}^{n}\max(S_j(T) - K, 0), \tag{9.12}$$

where B(0,T) is the Vasicek bond price obtained in (9.10) and $S_j(T)$ is the
terminal asset price for the j-th path.
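
The whole procedure, (9.11) together with (9.12), can be sketched in R as follows. The function name and every parameter value are hypothetical; the discount factor is computed from the closed form (9.10), assumed here to be $\exp\{-bT - (r_0-b)(1-e^{-aT})/a + \sigma_2^2(4e^{-aT} - e^{-2aT} + 2aT - 3)/(4a^3)\}$.

```r
# Call price under Black-Scholes stock dynamics with a Vasicek short rate.
# n = number of time steps, M = number of simulated paths.
call_stochastic_rate <- function(S0, K, T, r0, a, b, sigma1, sigma2, rho, n, M) {
  dt <- T / n
  S <- rep(S0, M)
  r <- rep(r0, M)
  for (i in 1:n) {
    z1 <- rnorm(M); z2 <- rnorm(M)
    eps1 <- z1                                   # correlated shocks
    eps2 <- z2 * sqrt(1 - rho^2) + rho * z1
    S_next <- S + r * S * dt + sigma1 * S * eps1 * sqrt(dt)   # stock step in (9.11)
    r <- r + a * (b - r) * dt + sigma2 * eps2 * sqrt(dt)      # rate step in (9.11)
    S <- S_next
  }
  # Vasicek discount factor B(0,T), closed form (9.10)
  B <- exp(-b * T - (r0 - b) * (1 - exp(-a * T)) / a +
             sigma2^2 / (4 * a^3) * (4 * exp(-a * T) - exp(-2 * a * T) + 2 * a * T - 3))
  B * mean(pmax(S - K, 0))                       # estimator (9.12)
}

set.seed(1)
call_stochastic_rate(S0 = 100, K = 100, T = 1, r0 = 0.04, a = 0.5, b = 0.05,
                     sigma1 = 0.2, sigma2 = 0.01, rho = -0.3, n = 50, M = 20000)
```

Setting `sigma2 = 0` and `r0 = b` makes the rate constant, so the result can be compared with the Black-Scholes price as a sanity check.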
9.5 EXERCISES


1. Verify (9.10) by computing the limit. 
2. Consider the CIR model:

$$dr = 0.1(0.05 - r)\,dt + 0.3\sqrt{r}\,dW \quad\text{and}\quad r_0 = 0.052.$$

(a) Construct and implement a standard Monte Carlo simulation to
compute the discount factor.

(b) Use the Vasicek discount factor of (9.10) as a control variate to
improve the simulation in (a). (Hint: you may use $\sigma_{\text{Vasicek}} = \sigma_{\text{CIR}}\sqrt{r_0}$.)

(c) Compare the difference between two prices based on 1,000 simu- 
lated prices. 


3. Consider the Ho-Lee interest rate movement:

$$dr = \theta(t)\,dt + \sigma\,dW, \tag{$*$}$$

where $\theta(t) = a + e^{-bt}$, $\sigma$ is a constant, and W is the standard Brownian
motion.


(a) Provide an algorithm to compute B(0,t) by discretizing ($*$).


(b) For pricing a five-year bond paying coupons semiannually, you
adopted $\Delta t = 1/250$ to calculate the integration, and M = 1,000
to estimate each discount factor. What is the minimum size for the
random sample used to compute the bond price with simulations
in (a)?

(c) Express $r_i$ in terms of a, b, $\sigma$, h, and $r_0$ based on the algorithm in
(b), where $h = \Delta t$. Hence, show that

$$r_t = r_0 + at + \frac{1 - e^{-bt}}{b} + \epsilon\sigma\sqrt{t}, \quad\text{when } h \to 0,$$

where $\epsilon \sim N(0, 1)$.


(d) Modify the approach of Section 8.2.1 to derive a closed-form solution
for the discount factor under the Ho-Lee model.


Markov Chain Monte
Carlo Methods


10.1. INTRODUCTION 


Bayesian inference is an important area in statistics. It has found applications 
in a spectrum of disciplines. One of the main ingredients of Bayesian inference 
is to incorporate prior information via the specification of prior distributions. 
As information flows freely in financial markets, incorporating prior informa- 
tion with Bayesian ideas constitutes a natural approach. In this final chapter, 
we shall briefly introduce the essence of Bayesian statistics with reference to 
risk management. In particular, we shall discuss the celebrated Markov Chain 
Monte Carlo method in detail and illustrate its uses via a case study. 


10.2 BAYESIAN INFERENCE 


The essence of the Bayesian approach is to incorporate uncertainties for the
unknown parameters. Predictive inference is conducted via the joint probability
distribution of the parameters $\theta = (\theta_1, \theta_2, \ldots, \theta_r)$ conditional on the
observable data $x = (x_1, \ldots, x_n)$. The joint distribution is deduced from the
distribution of observable quantities via Bayes' theorem. Many excellent texts
have been written about the Bayesian paradigm; see, for example, DeGroot
(1970), Box and Tiao (1973), Berger (1985), O'Hagan (1994), Bernardo and
Smith (2000), Lee (2004), and Robert (2001), just to name a few. A succinct
introduction to Bayesian inference for time series is given in Tsay (2006).

The observational (or sampling) distribution $f(x|\theta)$ is the likelihood function.
Under the Bayesian framework, a prior distribution $p(\theta)$ is specified for
the parameter $\theta$. Inferences are conducted based on the posterior distribution
$\pi(\theta|x)$ according to the following identity:

$$\pi(\theta|x) = \frac{f(x|\theta)\,p(\theta)}{f(x)},$$

where f(x) is the marginal density such that

$$f(x) = \int f(x|\theta)\,p(\theta)\,d\theta. \tag{10.1}$$

The probability density function $\pi(\theta|x)$ is known as the posterior density
function. Since x is observed, the marginal density in (10.1) is a constant. It
is more convenient to express the posterior as

$$\pi(\theta|x) \propto L(\theta)\,p(\theta), \tag{10.2}$$

where $L(\theta) = f(x|\theta)$ is the likelihood function. One way to estimate $\theta$ is to
compute the posterior mean of $\theta$, i.e.,

$$\hat{\theta} = \int \theta\,\pi(\theta|x)\,d\theta. \tag{10.3}$$

Prior and posterior are relative to the observables. A posterior distribution
conditional on x can be used as a prior for a new observation y. This process
can be iterated and eventually leads to a new posterior via Bayes' theorem.
We illustrate this idea with a concrete example.


Example 10.1 Suppose we observe $x_1, \ldots, x_n$ independent random variables,
each $N(\mu, \sigma^2)$ with $\mu$ unknown and $\sigma^2$ known. Estimate $\mu$ in a Bayesian
setting.


The likelihood function is

$$L(\mu) = (2\pi\sigma^2)^{-n/2}\exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\right] \propto \exp\left[-\frac{n}{2\sigma^2}(\bar{x} - \mu)^2\right],$$

where $\bar{x}$ is the sample mean of the observations. It seems natural to assume
that $\mu$ follows a normal distribution by specifying the prior $p(\mu) \sim N(m, \tau^2)$,
where m and $\tau^2$ are known as hyperparameters. Substituting this prior into
(10.2), we have

$$\pi(\mu|x) \propto \exp\left[-\frac{n}{2\sigma^2}(\bar{x} - \mu)^2 - \frac{(\mu - m)^2}{2\tau^2}\right] \propto \exp\left[-\frac{(\mu - m_1)^2}{2\tau_1^2}\right],$$

where

$$m_1 = \frac{n\bar{x}/\sigma^2 + m/\tau^2}{n/\sigma^2 + 1/\tau^2} \quad\text{and}\quad \frac{1}{\tau_1^2} = \frac{n}{\sigma^2} + \frac{1}{\tau^2},$$

or equivalently,

$$\mu\,|\,x \sim N(m_1, \tau_1^2).$$

The posterior mean $\hat{\mu} = E(\mu|x) = m_1$ is an estimate of $\mu$ given x. Notice
that $m_1$ tends to the sample mean $\bar{x}$ and $\tau_1^2$ tends to zero as the number
of observations increases. In most cases, the prior distribution plays a lesser
role when the sample size is large. Another interesting observation is that
the information contained in the prior becomes less when $\tau^2$ increases.
When $\tau^2 \to \infty$, $p(\mu) \propto$ constant and $\pi(\mu|x) = N(\bar{x}, \sigma^2/n)$. Such a prior is
known as the noninformative prior as it provides no information about the
distribution of $\mu$.
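
The posterior update of Example 10.1 can be sketched in a few lines of R, assuming the standard normal-normal result $m_1 = (n\bar{x}/\sigma^2 + m/\tau^2)/(n/\sigma^2 + 1/\tau^2)$ and $1/\tau_1^2 = n/\sigma^2 + 1/\tau^2$. The function name and the toy data are hypothetical.

```r
# Posterior of mu for N(mu, sigma2) data with sigma2 known and prior mu ~ N(m, tau2).
normal_posterior <- function(x, sigma2, m, tau2) {
  n <- length(x)
  precision <- n / sigma2 + 1 / tau2            # posterior precision 1/tau1^2
  list(mean = (n * mean(x) / sigma2 + m / tau2) / precision,
       var  = 1 / precision)
}

x <- c(1, 2, 3)                                 # toy data with sample mean 2
normal_posterior(x, sigma2 = 1, m = 0, tau2 = 1)     # informative prior pulls the mean to 1.5
normal_posterior(x, sigma2 = 1, m = 0, tau2 = 1e12)  # nearly flat prior: mean ~ 2, var ~ 1/3
```

With the nearly flat prior the posterior mean and variance approach $\bar{x}$ and $\sigma^2/n$, matching the noninformative limit described above.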

There are many ways to specify a prior distribution in the Bayesian setting. 
Some prefer noninformative priors and others prefer priors that are analyti- 
cally tractable. Conjugate priors are adopted to address the latter concern. 

Given a likelihood function, the conjugate prior distribution is a prior
distribution such that the posterior distribution belongs to the same class of
distributions as the prior. A conjugate prior and its posterior
differ only through their hyperparameters. Example 10.1 serves as a good example.
Conjugate priors facilitate statistical inference because the posterior
distributions belong to the same family as the prior distributions, which are usually
of familiar forms. Moreover, updating posterior distributions with new
information becomes straightforward as only hyperparameters have to be updated.

In the one-dimensional case, deriving conjugate priors is relatively simple
when the likelihood belongs to the exponential family. Conjugacy within
the exponential family is discussed in Lee (2004). Table 10.1 summarizes
some of the commonly used conjugate families. Herein, Be denotes the Beta
distribution, G the Gamma distribution, IG the Inverse Gamma distribution,
and N the Normal distribution.


Likelihood L(θ)                   Conjugate prior p(θ)
Poisson, θ = λ                    G(α, β)
Binomial, θ = p                   Be(α, β)
Normal, θ = μ, σ² known           N(m, τ²)
Normal, θ = σ², μ known           IG(α, β)

Table 10.1 Conjugate priors.


10.3. SIMULATING POSTERIORS


Bayesian inference makes natural use of simulation techniques to estimate
parameters. As shown in (10.3), calculating a posterior mean is tantamount to
numerically evaluating an integral. It is therefore not surprising that Monte
Carlo simulation plays an important role. The integration in (10.3) is usually
an improper integral (integration over an unbounded region), which
renders standard numerical techniques difficult to apply. Although one may use numerical
quadrature to bypass such a difficulty in the one-dimensional case, applying
quadrature in higher dimensions is far from simple. Financial modeling
is usually of higher dimensions.

Monte Carlo simulation with importance sampling simplifies the computation
of (10.3). Since it may be difficult to generate random variables from the
posterior distribution $\pi(\theta|x)$ directly, we may take advantage of the fact that
importance sampling enables us to compute integrals with a conveniently
chosen density. Consider

$$\hat{\theta} = \int \theta\,\pi(\theta|x)\,d\theta = \int \frac{\theta\,\pi(\theta|x)}{q(\theta)}\,q(\theta)\,d\theta, \tag{10.4}$$

where $q(\theta)$ is an a priori specified density function from which samples can be
generated easily. Drawing n random samples $\theta_j$ from $q(\theta)$, we approximate the posterior
mean by

$$\hat{\theta} \approx \frac{1}{n}\sum_{j=1}^{n}\frac{\theta_j\,\pi(\theta_j|x)}{q(\theta_j)}.$$

Note that importance sampling is not used as a variance reduction device
here. It is applied to facilitate the computation of the posterior mean. The
variance of the computation can be large in some cases.
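
A small R sketch of this estimator is given below. To keep the example checkable, the posterior is assumed to be a known normal density, N(0.5, 0.2²), and the sampling density q is taken to be the standard normal; both choices, and the function name, are hypothetical. In a real application the normalized posterior density is rarely available in this form.

```r
# Importance sampling estimate of the posterior mean, as in (10.4):
# draw theta_j from q and average theta_j * pi(theta_j | x) / q(theta_j).
posterior_mean_is <- function(post_density, q_sample, q_density, n) {
  theta <- q_sample(n)
  mean(theta * post_density(theta) / q_density(theta))
}

set.seed(1)
est <- posterior_mean_is(
  post_density = function(th) dnorm(th, mean = 0.5, sd = 0.2),  # assumed posterior
  q_sample     = function(n) rnorm(n, mean = 0, sd = 1),        # sampler for q
  q_density    = function(th) dnorm(th, mean = 0, sd = 1),      # density of q
  n = 100000)
est   # close to the true posterior mean 0.5
```

Choosing q with heavier tails than the posterior keeps the weights bounded; a q that is too narrow can make the variance of the estimate very large.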


10.4 MARKOV CHAIN MONTE CARLO 


One desirable feature of combining Markov chain simulation with Bayesian
ideas is that the resulting method can handle high-dimensional problems
efficiently. Another is that it draws random samples from the posterior
distribution directly. The Markov chain Monte Carlo (MCMC) methods are
developed with these two features in mind.


10.4.1 Gibbs Sampling 


Gibbs sampling is probably one of the most commonly used MCMC methods.
It is simple, intuitive, easily implemented, and designed to handle
multidimensional problems. The basic limit theorem of Markov chains serves as
the theoretical building block to guarantee that draws from Gibbs sampling
agree with the posterior asymptotically.

Although conjugate priors are useful in Bayesian inference, it is difficult to
construct a joint conjugate prior for several parameters. For a normal
distribution with both mean and variance unknown, deriving the corresponding
conjugate prior can be challenging. But conditional conjugate priors can be
obtained relatively easily; see, for example, Gilks, Richardson and
Spiegelhalter (1995). Conditioning on the other parameters, a conditional
conjugate prior is one-dimensional and has the same distributional structure
as the conditional posterior.

Gibbs sampling takes advantage of this fact and offers a way to reduce a
multidimensional problem to an iteration of low-dimensional problems.
Specifically, let $x = (x_1, \ldots, x_n)$ be the data and let the distribution
of each $x_i$ be governed by $r$ parameters, $\theta = (\theta_1, \theta_2,
\ldots, \theta_r)$. For each $j = 1, \ldots, r$, specify the one-dimensional
conditional conjugate prior $p(\theta_j)$ and construct the conditional
posterior by means of Bayes' theorem. Then iterate the Gibbs procedure as
follows.

Set an initial parameter vector $(\theta_1^{(0)}, \ldots, \theta_r^{(0)})$.
Update the parameters by the following procedure:

• Sample $\theta_1^{(1)} \sim p(\theta_1 | \theta_2^{(0)}, \ldots, \theta_r^{(0)}, x)$;

• Sample $\theta_2^{(1)} \sim p(\theta_2 | \theta_1^{(1)}, \theta_3^{(0)}, \ldots, \theta_r^{(0)}, x)$;

  ⋮

• Sample $\theta_r^{(1)} \sim p(\theta_r | \theta_1^{(1)}, \theta_2^{(1)}, \ldots, \theta_{r-1}^{(1)}, x)$.

This completes one Gibbs iteration and the parameters are updated to
$(\theta_1^{(1)}, \ldots, \theta_r^{(1)})$. Using these new parameters as
starting values, repeat the iteration and obtain a new set of parameters
$(\theta_1^{(2)}, \ldots, \theta_r^{(2)})$. Repeating these iterations $M$
times, we get a sequence of parameter vectors $\theta^{(1)}, \ldots,
\theta^{(M)}$, where $\theta^{(i)} = (\theta_1^{(i)}, \ldots, \theta_r^{(i)})$
for $i = 1, \ldots, M$. By virtue of the basic limit theorem of Markov chains,
it can be shown that the Markov chain $\{\theta^{(i)}\}$ has a limiting
distribution converging to the joint posterior $p(\theta_1, \theta_2, \ldots,
\theta_r | x)$ when $M$ is sufficiently large; see Tierney (1994). The number
$M$ is called the burn-in period. After simulating $\{\theta^{(M+1)},
\theta^{(M+2)}, \ldots, \theta^{(M+n)}\}$ from the Gibbs sampling, Bayesian
inference can be conducted easily. For example, to compute the posterior mean,
we evaluate

$$\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} \theta^{(M+i)}.$$

To acquire a clearer understanding of the Gibbs sampler, consider the
following example:


Example 10.2 Let $x_1, \ldots, x_n$ be independent $N(\mu, \sigma^2)$ random
variables with both $\mu$ and $\sigma^2$ unknown. Estimate $\mu$ and $\sigma^2$
via Gibbs sampling.

Recall that the conjugate prior of $\mu$ is normal for a given $\sigma^2$ and
that the conjugate prior of $\sigma^2$ is inverse gamma for a given $\mu$. Let
$\mu_0 \sim N(m_0, \tau_0^2)$ and $\sigma_0^2 \sim IG(\alpha_0, \beta_0)$ be
random variables drawn from the initial priors. Define $\mu_i$ and
$\sigma_i^2$ to be random variables generated in the $i$-th iteration of the
Gibbs sampling procedure. The conditional posterior for $\mu_i$ can be
obtained by mimicking Example 10.1. We have

$$\mu_i \sim N(m_i, \tau_i^2),$$

where

$$m_i = \frac{\tau_{i-1}^2\,\bar{x} + m_{i-1}\,\sigma_{i-1}^2/n}{\tau_{i-1}^2 + \sigma_{i-1}^2/n}
\quad\text{and}\quad
\tau_i^2 = \frac{\tau_{i-1}^2\,\sigma_{i-1}^2}{n\tau_{i-1}^2 + \sigma_{i-1}^2}. \qquad (10.5)$$

In Question 1 at the end of this chapter, the conditional posterior for
$\sigma^2$ is found to be $\sigma_i^2 \sim IG(\alpha_i, \beta_i)$ where

$$\alpha_i = \frac{n}{2} + \alpha_{i-1}
\quad\text{and}\quad
\beta_i = \beta_{i-1} + \frac{1}{2}\sum_{j=1}^{n} (x_j - \mu_{i-1})^2. \qquad (10.6)$$

Hence, Gibbs sampling is implemented as follows:
1. Set $i = 0$ and initial values of $m_0$, $\tau_0^2$, $\alpha_0$, $\beta_0$, and $\sigma_0^2$;
2. Sample $\mu_i \sim N(m_i, \tau_i^2)$ and update $\alpha_{i+1}$ and $\beta_{i+1}$ by (10.6);
3. Sample $\sigma_{i+1}^2 \sim IG(\alpha_{i+1}, \beta_{i+1})$ and update $m_{i+1}$ and $\tau_{i+1}^2$ by (10.5);
4. Set $i = i + 1$;
5. Go to Step 2 until $i$ equals a prespecified integer $M + k$.

After that, we keep the last $k$ pairs of random variables for indices $M + 1$
to $M + k$. Estimation is achieved by taking sample means:

$$\hat{\mu} = \frac{1}{k}\sum_{j=1}^{k} \mu_{M+j}, \qquad
\hat{\sigma}^2 = \frac{1}{k}\sum_{j=1}^{k} \sigma_{M+j}^2.$$
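A minimal sketch of a Gibbs sampler for this example follows. It is not the book's listing: it keeps the prior hyperparameters fixed across iterations (the common form of the sampler) instead of updating them recursively as in Steps 2 and 3, and the function name `gibbs_normal`, the simulated data, and all numerical settings are assumptions made for illustration.

```python
import random


def gibbs_normal(x, m0, tau2_0, alpha0, beta0, burn_in, keep, rng):
    """Gibbs sampler for N(mu, sigma2) data with conditionally conjugate priors.

    Priors (held fixed): mu ~ N(m0, tau2_0), sigma2 ~ IG(alpha0, beta0).
    Returns the sample means of the last `keep` draws of mu and sigma2.
    """
    n = len(x)
    xbar = sum(x) / n
    sigma2 = 1.0  # arbitrary starting value for the chain
    mu_sum = 0.0
    sigma2_sum = 0.0
    for it in range(burn_in + keep):
        # mu | sigma2, x ~ N(m, tau2): normal-normal update of the form (10.5)
        m = (tau2_0 * xbar + m0 * sigma2 / n) / (tau2_0 + sigma2 / n)
        tau2 = tau2_0 * sigma2 / (n * tau2_0 + sigma2)
        mu = rng.gauss(m, tau2 ** 0.5)
        # sigma2 | mu, x ~ IG(alpha, beta): update of the form (10.6)
        alpha = alpha0 + n / 2.0
        beta = beta0 + 0.5 * sum((xi - mu) ** 2 for xi in x)
        # If G ~ Gamma(shape=alpha, scale=1/beta), then 1/G ~ IG(alpha, beta).
        sigma2 = 1.0 / rng.gammavariate(alpha, 1.0 / beta)
        if it >= burn_in:
            mu_sum += mu
            sigma2_sum += sigma2
    return mu_sum / keep, sigma2_sum / keep


rng = random.Random(42)
data = [rng.gauss(1.0, 2.0) for _ in range(500)]  # true mu = 1, sigma2 = 4
mu_hat, sigma2_hat = gibbs_normal(
    data, m0=0.0, tau2_0=100.0, alpha0=2.0, beta0=2.0,
    burn_in=500, keep=2000, rng=rng,
)
print(mu_hat, sigma2_hat)
```

With 500 observations and weak priors, the two sample means should land near the values used to simulate the data.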
10.4.2 Case Study: The Impact of Jumps on Dow Jones


To appreciate the usefulness of Gibbs sampling, we use it to estimate
parameters of a jump-diffusion model and examine the impact of jumps in major
financial indices. Note that maximum likelihood estimation does not work for
this model (see Redner and Walker, 1984).

In the jump-diffusion model of Merton (1976), the dynamics of asset returns
are assumed to be

$$d\log S = \mu\,dt + \sigma\,dW_t + Y\,dN_t, \qquad (10.7)$$

where $S$ is the equity price, $W_t$ is the standard Brownian motion, $N_t$
follows a Poisson process with an intensity $\lambda$, and $Y$ is a normal
random variable with mean $k$ and variance $s^2$. We assume that $dW_t$,
$dN_t$, and $Y$ are independent random variables at each time point $t$. This
model requires estimation of $\mu$, $\sigma$, $\lambda$, $k$, and $s$ based on
observations $\{S_1, \ldots, S_n, S_{n+1}\}$, where $S_i$ represents the equity
price observed at time $t_i$. These prices produce $n$ independent
log-returns, which are denoted by $\{X_1, \ldots, X_n\}$ where
$X_i = \log S_{i+1} - \log S_i$. With a fixed $\Delta t$, a discrete
approximation to the dynamics (10.7) is

$$\Delta\log S = \mu\,\Delta t + \sigma\,\Delta W_t + Y\,\Delta N_t. \qquad (10.8)$$

When $\Delta t$ is sufficiently small, $\Delta N_t$ is either 1, with
probability $\lambda\Delta t$, or 0, with probability $1 - \lambda\Delta t$.


Example 10.3 Simulate 100 sample paths from the asset price dynamics of
(10.7) with parameters: $\mu = 0.08$, $\sigma = 0.4$, $\lambda = 3.5$,
$s = 0.3$, and $k = 0$. Each sample path replicates daily log-returns of a
stock over a one-year horizon. Based on these 100 paths, estimate the values
of $\mu$, $\sigma$, $\lambda$, $s$, and $k$ with the Gibbs sampling. Compare
the results with the input values.


Simulating paths

Sample paths are simulated by assuming $n = 250$ trading days a year, so
the discretization (10.8) has $\Delta t = 1/250$. On each path, the log-asset
price at each time point is generated as follows:

$$\log S_{i+1} - \log S_i =
\begin{cases}
\mu\Delta t + \sigma\sqrt{\Delta t}\,\epsilon, & \text{if } U \ge \lambda\Delta t, \\
\mu\Delta t + k + \sqrt{\sigma^2\Delta t + s^2}\,\epsilon, & \text{if } U < \lambda\Delta t,
\end{cases}$$

where $\epsilon \sim N(0,1)$ and $U \sim U(0,1)$ are independent random
variables. To simplify notation, we denote $x_i = \log S_{i+1} - \log S_i$.
The corresponding SPLUS code and a graph of three sample paths are given
below.


### generate observation of Y ####
MUY <- 0.08;       # drift
SIGMAY <- 0.40;    # diffusion volatility
MUJ <- 0;          # mean of jump size
SIGMAJ <- 0.3;     # s.d. of jump size
LAMBDA <- 3.5;     # jump intensity per year
m <- 100;          # number of paths
n <- 250;          # number of time steps per path
dt <- 1/250;

Y <- matrix(100,m,n+1);   # each row is one path, started at 100

for (i in 1:n) {
    JUMP <- ifelse(runif(m)<LAMBDA*dt,1,0);
    JumpSize <- JUMP*rnorm(m,MUJ,SIGMAJ);
    Y[,i+1] <- Y[,i] + rnorm(m,MUY*dt,SIGMAY*sqrt(dt)) + JumpSize;
}

plot(Y[1,], type='l', xlab='time', ylab='stock price')
for (k in 2:100) {
    plot(Y[k,], type='l', xlab='time', ylab='stock price')
}


Fig. 10.1 A sample path of the jump-diffusion model (horizontal axis: time,
0 to 250; vertical axis: stock price).


Gibbs sampling

There are five parameters in the model, so we have to develop five
conditional conjugate priors from their conditional likelihood functions. Let
us proceed step by step.


1. Conditional prior and posterior for $\mu$:
Other things being fixed, the likelihood function of $\mu$ is proportional
to a normal density. Specifically,

$$L(\mu) \propto \prod_{i=1}^{n} \exp\left[-\frac{(x_i - \mu\Delta t - Y_i\Delta N_i)^2}{2\sigma^2\Delta t}\right]
\propto \exp\left\{-\frac{n\Delta t}{2\sigma^2}\,(\mu - \hat{\mu})^2\right\},
\qquad \hat{\mu} = \frac{1}{n\Delta t}\sum_{i=1}^{n}(x_i - Y_i\Delta N_i).$$

Therefore, a normal distribution $N(m, \tau^2)$ is suitable for $\mu$ as a
conditional conjugate prior. The posterior distribution can be immediately
obtained as

$$N\left(\frac{\tau^2\,\hat{\mu} + m\,\sigma^2/(n\Delta t)}{\tau^2 + \sigma^2/(n\Delta t)},\;
\frac{\tau^2\sigma^2}{n\Delta t\,\tau^2 + \sigma^2}\right). \qquad (10.9)$$


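2. Conditional prior and posterior for $\sigma^2$:
The conditional likelihood function of $\sigma^2$ is

$$L(\sigma^2) \propto (\sigma^2)^{-n/2}\exp\left[-\frac{1}{2\sigma^2\Delta t}\sum_{i=1}^{n}(x_i - \mu\Delta t - Y_i\Delta N_i)^2\right].$$

We select $IG(\alpha, \beta)$ as the conditional prior for $\sigma^2$. Then,
the posterior distribution becomes

$$IG\left(\alpha + \frac{n}{2},\; \beta + \frac{\sum_{i=1}^{n}(x_i - \mu\Delta t - Y_i\Delta N_i)^2}{2\Delta t}\right). \qquad (10.10)$$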


3. Conditional prior and posterior for $\lambda$:
The conditional likelihood of $\lambda$ is

$$L(\lambda) \propto (\lambda\Delta t)^{N}(1 - \lambda\Delta t)^{n-N},$$

where $N$ is the total number of jumps in the horizon. From Table 10.1, we
find that the appropriate conjugate prior for $\lambda\Delta t$ is
$Be(a, b)$. Simple computation shows that the posterior distribution of
$\lambda\Delta t$ is

$$Be(a + N,\; b + n - N). \qquad (10.11)$$


4. Conditional prior and posterior for $k$:
Since $k$ is the mean of the normal jump size, its prior and posterior are
obtained in the same manner as for $\mu$. We state the result without proof.
The prior is $N(m_Y, \tau_Y^2)$ and the posterior is given by

$$N\left(\frac{\tau_Y^2\sum_{i=1}^{N}Y_i/N + m_Y s^2/N}{\tau_Y^2 + s^2/N},\;
\frac{\tau_Y^2 s^2}{N\tau_Y^2 + s^2}\right). \qquad (10.12)$$

5. Conditional prior and posterior for $s^2$:
Since $s^2$ is the variance of the normal jump size, its prior and posterior
are obtained in the same manner as for $\sigma^2$. The prior is
$IG(\alpha_Y, \beta_Y)$ and the posterior is given by

$$IG\left(\alpha_Y + \frac{N}{2},\; \beta_Y + \frac{\sum_{i=1}^{N}(Y_i - k)^2}{2}\right). \qquad (10.13)$$
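Two of the conjugate updates above can be sketched as follows. This is an illustrative sketch, not the book's code: the function names `draw_lambda` and `draw_s2` and the jump sizes in the usage example are made up, and the prior values are the ones that appear in the SPLUS listing later in this section.

```python
import random


def draw_lambda(num_jumps, n, a, b, dt, rng):
    """Draw the jump intensity using the conjugate update (10.11).

    lambda * dt has prior Be(a, b); given N = num_jumps jumps in n intervals,
    its posterior is Be(a + N, b + n - N). Dividing by dt annualizes the draw.
    """
    p = rng.betavariate(a + num_jumps, b + n - num_jumps)
    return p / dt


def draw_s2(jump_sizes, k, alpha_y, beta_y, rng):
    """Draw the jump-size variance using the conjugate update (10.13)."""
    alpha = alpha_y + len(jump_sizes) / 2.0
    beta = beta_y + 0.5 * sum((y - k) ** 2 for y in jump_sizes)
    # If G ~ Gamma(shape=alpha, scale=1/beta), then 1/G ~ IG(alpha, beta).
    return 1.0 / rng.gammavariate(alpha, 1.0 / beta)


rng = random.Random(11)
lam_draws = [draw_lambda(4, 250, 2.0, 60.0, 1.0 / 250, rng) for _ in range(20_000)]
lam_mean = sum(lam_draws) / len(lam_draws)

jumps = [0.31, -0.12, 0.25, -0.40]  # made-up jump sizes, for illustration only
s2_draws = [draw_s2(jumps, 0.0, 2.5, 0.04, rng) for _ in range(20_000)]
s2_mean = sum(s2_draws) / len(s2_draws)
print(lam_mean, s2_mean)
```

The averages can be compared with the analytical posterior means, $(a+N)/\{(a+b+n)\Delta t\}$ for $\lambda$ and $\beta/(\alpha-1)$ for an $IG(\alpha,\beta)$ variable.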


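The above priors and posteriors are distributions conditional on values of
$Y_i$ and $\Delta N_i$. This complicates the Gibbs sampling procedure since
only $x_i$ is observable for all $i$. Therefore, at each time point $t_i$,
$Y_i$ and $\Delta N_i$ should be simulated from distributions conditional on
the observed value of $x_i$ before substituting them into the priors and
posteriors. We need the following facts:

$$x_i \,|\, \Delta N_i = 0 \sim N(\mu\Delta t,\; \sigma^2\Delta t);$$
$$x_i \,|\, \Delta N_i = 1 \sim N(\mu\Delta t + k,\; \sigma^2\Delta t + s^2),$$

which together with Bayes' theorem show that

$$P(\Delta N_i = 1 | x_i) = \frac{P(x_i|\Delta N_i = 1)\,\lambda\Delta t}{P(x_i|\Delta N_i = 1)\,\lambda\Delta t + P(x_i|\Delta N_i = 0)(1 - \lambda\Delta t)},$$
$$P(\Delta N_i = 0 | x_i) = 1 - P(\Delta N_i = 1 | x_i). \qquad (10.14)$$

The jump size $Y_i$ is necessary only when $\Delta N_i = 1$. In that case, the
conditional density function of $Y_i$ is

$$f(Y_i|x_i) \propto f(x_i|Y_i)\,f(Y_i) \propto
\exp\left[-\frac{(x_i - \mu\Delta t - Y_i)^2}{2\sigma^2\Delta t}\right]
\exp\left[-\frac{(Y_i - k)^2}{2s^2}\right],$$

which implies

$$Y_i \,|\, x_i \sim N\left(\frac{(x_i - \mu\Delta t)/(\sigma^2\Delta t) + k/s^2}{1/(\sigma^2\Delta t) + 1/s^2},\;
\frac{1}{1/(\sigma^2\Delta t) + 1/s^2}\right). \qquad (10.15)$$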
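The two conditional quantities in (10.14) and (10.15) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the parameter values in the usage lines are the inputs of Example 10.3.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def jump_probability(x, mu, sigma2, lam, k, s2, dt):
    """P(dN_i = 1 | x_i) as in (10.14)."""
    f1 = normal_pdf(x, mu * dt + k, sigma2 * dt + s2)  # density given a jump
    f0 = normal_pdf(x, mu * dt, sigma2 * dt)           # density given no jump
    return f1 * lam * dt / (f1 * lam * dt + f0 * (1.0 - lam * dt))


def jump_size_params(x, mu, sigma2, k, s2, dt):
    """Mean and variance of Y_i given x_i and dN_i = 1, as in (10.15)."""
    precision = 1.0 / (sigma2 * dt) + 1.0 / s2
    mean = ((x - mu * dt) / (sigma2 * dt) + k / s2) / precision
    return mean, 1.0 / precision


# Inputs of Example 10.3: mu = 0.08, sigma = 0.4, lambda = 3.5, k = 0, s = 0.3.
params = dict(mu=0.08, sigma2=0.16, lam=3.5, k=0.0, s2=0.09, dt=1.0 / 250)
p_small = jump_probability(0.01, **params)  # an ordinary daily return
p_big = jump_probability(0.50, **params)    # a return far outside diffusion range
mean_y, var_y = jump_size_params(0.50, 0.08, 0.16, 0.0, 0.09, 1.0 / 250)
print(p_small, p_big, mean_y, var_y)
```

A daily log-return of 0.01 is well within the range of the diffusion part, so the posterior jump probability is tiny, whereas a return of 0.5 is almost surely a jump and nearly all of it is attributed to the jump size.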


With all the ingredients ready, the Gibbs sampling starts by choosing initial
values of $\mu_0$, $\sigma_0^2$, $k_0$, $\lambda_0$, and $s_0^2$. We also need
initial values of $Y_i^{(0)}$ and $\Delta N_i^{(0)}$, both of which can be
obtained by a simulation with the initial parameters. The Gibbs sampling runs
as follows:

1. Sample $\mu_j \sim p(\mu_j | \sigma_{j-1}^2, k_{j-1}, s_{j-1}^2, \lambda_{j-1})$ as given in (10.9);

2. Sample $\sigma_j^2 \sim p(\sigma_j^2 | \mu_j, k_{j-1}, s_{j-1}^2, \lambda_{j-1})$ as given in (10.10);

3. Sample $\lambda_j \sim p(\lambda_j | \mu_j, \sigma_j^2, k_{j-1}, s_{j-1}^2)$ as given in (10.11);

4. Sample $k_j \sim p(k_j | \mu_j, \sigma_j^2, s_{j-1}^2, \lambda_j)$ as given in (10.12);

5. Sample $s_j^2 \sim p(s_j^2 | \mu_j, \sigma_j^2, k_j, \lambda_j)$ as given in (10.13);

6. Sample $\Delta N_i^{(j)} \sim p(\Delta N_i^{(j)} | \mu_j, \sigma_j^2, \lambda_j, k_j, s_j^2)$ as given in (10.14) for all $i = 1, 2, \ldots, n$;

7. Sample $Y_i^{(j)} \sim p(Y_i^{(j)} | \mu_j, \sigma_j^2, k_j, s_j^2)$ as given in (10.15) at the time points $t_i$ where $\Delta N_i^{(j)} = 1$;

8. Set $j = j + 1$ and go to Step 1. Repeat until $j = M' + M$.

Inference is drawn by taking sample means of the values of the last $M$
simulated parameters. The SPLUS code is given as follows:


### Input the stock price ###
Y <- read.table("table.txt")
Y <- log(Y)

### initial value for the Markov chain ###
dY <- diff(Y);
lambda <- 6;
mu <- 0;
sigma <- 1;
k <- 0;
s <- 1;
jump <- c(rep(1,n))
jumpsize <- dY/2

### assign prior distribution ###
MEANmu <- mean(dY)/dt/n;
VARmu <- 1;
MEANK <- 0.5;
VARK <- 1;
ALPHAsigma <- 2.5;
BETAsigma <- 1/25;
ALPHAs <- 2.5;
BETAs <- 1/25;
ALPHAlambda <- 2;
BETAlambda <- 60;

sample <- c(rep(0,n+5))     # running sums of jump indicators and parameters
result <- c(rep(0,5))       # parameter draws, one column per iteration

for (i in 1:m) {
    ### calculate the parameters for posterior distributions ###
    Vmu <- sigma^2/n/dt;                           #normal
    Mmu <- sum(dY-jumpsize)/n/dt;
    Vmu2 <- 1/(1/Vmu + 1/VARmu);
    Mmu2 <- (Mmu/Vmu+MEANmu/VARmu)/(1/Vmu+1/VARmu);
    mu <- rnorm(1,Mmu2,sqrt(Vmu2))

    # likelihood kernel in sigma^2 is IG(n/2-1, Bsigma), so that the
    # posterior shape below equals n/2 + ALPHAsigma as in (10.10)
    Asigma <- n/2-1                                #inverted gamma
    Bsigma <- sum((dY-mu*dt-jumpsize)^2)/2/dt;
    Asigma2 <- Asigma + ALPHAsigma+1;
    Bsigma2 <- Bsigma + BETAsigma;
    sigma <- 1/sqrt(rgamma(1,Asigma2,Bsigma2));

    J <- jumpsize[jump==1]
    j <- sum(jump)
    Alambda <- j+1;                                #BETA
    Blambda <- n-j+1;
    Alambda2 <- Alambda + ALPHAlambda-1;
    Blambda2 <- Blambda + BETAlambda-1;
    lambda <- rbeta(1,Alambda2,Blambda2)/dt;

    if (j>1) {
        Mk <- mean(J);                             #normal
        Vk <- s^2/j;
        Vk2 <- 1/(1/Vk + 1/VARK);
        Mk2 <- (Mk/Vk+MEANK/VARK)/(1/Vk+1/VARK);
        k <- rnorm(1,Mk2,sqrt(Vk2));
        As <- j/2-1                                #inverted gamma
        Bs <- sum((J-k)^2)/2;
        As2 <- As + ALPHAs+1;
        Bs2 <- Bs + BETAs;
        s <- 1/sqrt(rgamma(1,As2,Bs2));
    }

    ### find the probabilities of jump and distribution of jump ###
    varjump <- 1/(1/s^2+1/sigma^2/dt);
    meanjump <- ( (dY-mu*dt)/sigma^2/dt + k/s^2 ) *varjump;
    jumpsize <- jump*(rnorm(n)*sqrt(varjump) + meanjump);

    ratio1 <- (1-lambda*dt)/(lambda*dt);
    ratio2 <- sqrt((sigma^2*dt+s^2)/(sigma^2*dt));
    ratio3 <- - (dY-mu*dt)^2/sigma^2/2/dt + (dY-mu*dt-k)^2/(sigma^2*dt+s^2)/2;
    pjumps <- 1/( 1 + ratio1*ratio2*exp(ratio3));
    jump <- ifelse(runif(n)<pjumps,1,0);

    # collect the current draws in S (not s, which holds the jump s.d.)
    S <- c(mu,sigma,lambda,k,s);
    sample <- sample+c(jump,S);
    result <- c(result, S)
    print(c(S,sum(jump)))
}

result <- matrix(result, 5)
plot(result[1,1:m], type="l", xlab="no. of step", ylab="drift")
plot(result[2,1:m], type="l", xlab="no. of step", ylab="volatility")
plot(result[3,1:m], type="l", xlab="no. of step", ylab="intensity")
plot(result[4,1:m], type="l", xlab="no. of step", ylab="mean of jump")
plot(result[5,1:m], type="l", xlab="no. of step", ylab="S.D of jump")

### the probabilities that time point has jump ###
print(sample[1:n]/m)
### the estimated parameter ###
print(sample[(n+1):(n+5)]/m)
### the true parameter ###
print(c(MUY,SIGMAY,LAMBDA,MUJ,SIGMAJ,sum(JUMP)))
### the sample mean & volatility of jump ###
JS <- JumpSize[JUMP==1]
print(c(sum(JUMP),mean(JS),sqrt(var(JS))))


Results and comparisons

Table 10.2 shows our estimation results. We report the averaged posterior
means over the 100 sample paths and the variances. It is seen that the
estimates are close to the true values and the variances are small. Gibbs
sampling does a good job in estimating parameters for jump-diffusion models.
□


              $\mu$     $\sigma$               $\lambda$   $k$      $s^2$

True value    0.08     0.4                   3.5        0        0.3
Mean          0.0769   0.3986                3.8600     0.0163   0.2868
Variance      0.0233   $6.5\times10^{-5}$    0.8895     0.0039   0.0015


Table 10.2 Performance of the Gibbs sampling.


Example 10.3 shows the usefulness of Gibbs sampling in estimating the
jump-diffusion model. In practice, this application can be crucial for a risk
manager to assess how much risk is due to jumps. To examine the jump risk
empirically, we estimate the impact of jumps on the Dow Jones Industrial
Index. Our estimation is based on daily closing prices over the period
1995-2004. Parameters are estimated on an annual basis.


Year    $\mu$      $\sigma$   $\lambda$   $k$       $s^2$

1995    0.2871    0.0901    1.9035     0.0627   0.2608
1996    0.2483    0.1172    2.818     -0.0337   0.235
1997    0.2384    0.1684    3.6587    -0.0256   0.2087
1998    0.1776    0.1752    5.5127    -0.0123   0.1782
1999    0.2177    0.1624    1.7968    -0.0176   0.2627
2000   -0.0162    0.1971    3.3364    -0.0235   0.2157
2001    0.015     0.1951    4.1797    -0.0383   0.2008
2002   -0.2188    0.2484    2.7072     0.0106   0.239
2003    0.1891    0.1626    2.0479     0.0661   0.2463
2004    0.0351    0.1111    1.7561     0.0004   0.2788


Table 10.3 Jump-diffusion estimation for Dow Jones.


In Table 10.3, the number of jumps per year, $\lambda$, ranges from 1.75 to
5.5. Therefore, there can be as many as five or six jumps in a particular
year. The impact of jumps is significant as almost all $s^2$ are bigger than
0.2. The variances $\sigma^2$ associated with the Brownian motion part of the
model are around 0.2, but should be divided by 250 to produce a daily
variance. When a jump arrives, an extra daily variance of about 0.2 is added,
so that the index return variance becomes $\sigma^2/250 + s^2$. The additional
variance due to a jump is a relatively large quantity. Jump risk cannot be
ignored! This information is useful for risk managers to construct scenarios
for stress testing.
10.5 METROPOLIS-HASTINGS ALGORITHM


In this section, we explain why random draws using Gibbs sampling
approximate the posterior distribution. To obtain a general result, we first
introduce the Metropolis-Hastings algorithm, of which Gibbs sampling is a
special case. We then show that the Metropolis-Hastings algorithm constructs a
Markov chain whose limiting distribution is the posterior distribution.
Further details are given in Casella and George (1992), Chib and Greenberg
(1995), and Lee (2004).

Consider a Markov chain $\{\theta^{(n)}\}$ with a finite state space
$\{1, 2, \ldots, m\}$ and transition probabilities $p_{ij}$. Given the
transition probabilities, the limiting distribution of the chain can be found
by solving the following equation:

$$\pi(j) = \sum_{i=1}^{m} \pi(i)\,p_{ij}.$$

When the state space is continuous, the sum is replaced by an integral (see,
for example, Bhattacharya and Waymire, 1992).
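For a small chain the equation above can be solved numerically by iterating $\pi \leftarrow \pi P$ until the vector stops changing. The following sketch assumes an aperiodic chain so that the iteration converges; the three-state matrix `P` and the function name `stationary_distribution` are made up for illustration.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Iterate pi <- pi P from the uniform vector until it stops changing."""
    m = len(P)
    pi = [1.0 / m] * m
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(m)) for j in range(m)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical lazy random walk on three states.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]
pi = stationary_distribution(P)
print([round(v, 6) for v in pi])  # [0.25, 0.5, 0.25]
```

For a periodic chain the iteration can oscillate instead of converging, in which case the linear system should be solved directly.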

In MCMC, we work with the reverse problem. Given a posterior distribution
$\pi(j)$, we want to construct a Markov chain whose limiting distribution is
the posterior distribution. If the transition probabilities satisfy time
reversibility with respect to $\pi(j)$, then the limiting distribution is
guaranteed to be equal to $\pi(j)$. To explain time reversibility, write the
transition probabilities $p_{ij}$ as

$$p_{ij} = p_{ij}^{*} + r_i\,\delta_{ij},$$

where $p_{ii}^{*} = 0$, $p_{ij}^{*} = p_{ij}$ for $i \neq j$, $r_i = p_{ii}$,
and $\delta_{ii} = 1$ and $\delta_{ij} = 0$ for $i \neq j$.
If the equation

$$\pi(i)\,p_{ij}^{*} = \pi(j)\,p_{ji}^{*} \qquad (10.16)$$

is satisfied for all $i$ and $j$, then the probabilities $p_{ij}$ are time
reversible. This condition asserts that the probability of starting at $i$ and
ending at $j$ when the initial probability is given by $\pi(i)$ is the same as
that of starting at $j$ and ending at $i$. By simple computation, we check
that

$$\sum_i \pi(i)\,p_{ij} = \sum_i \pi(i)\,p_{ij}^{*} + \sum_i \pi(i)\,r_i\,\delta_{ij}
= \sum_i \pi(j)\,p_{ji}^{*} + \pi(j)\,r_j
= \pi(j)(1 - r_j) + \pi(j)\,r_j
= \pi(j).$$

Therefore, $\pi(j)$ is the limiting distribution of the chain.




In other words, a Markov chain whose limiting distribution is the posterior
distribution can be constructed by finding a time-reversible Markov chain.
We start this process by specifying transition probabilities $q_{ij}$. If the
probabilities $q_{ij}$ already satisfy time reversibility, then the
corresponding Markov chain is the one we want. Otherwise, suppose that

$$\pi(i)\,q_{ij} > \pi(j)\,q_{ji}.$$

Then, the chain has a higher probability of moving from $i$ to $j$ than from
$j$ to $i$. Therefore, we introduce a probability $\alpha_{ij}$ to reduce the
moves from $i$ to $j$. We would like to have

$$\pi(i)\,q_{ij}\,\alpha_{ij} = \pi(j)\,q_{ji}, \qquad (10.17)$$

so that

$$\alpha_{ij} = \frac{\pi(j)\,q_{ji}}{\pi(i)\,q_{ij}}.$$

Since we do not want to reduce the likelihood of moving from $j$ to $i$, we
set $\alpha_{ji} = 1$. Therefore, the general formula is

$$\alpha_{ij} = \min\left\{1,\; \frac{\pi(j)\,q_{ji}}{\pi(i)\,q_{ij}}\right\}. \qquad (10.18)$$

From (10.17) and (10.18), we see that the transition probabilities

$$p_{ij} = q_{ij}\,\alpha_{ij}, \quad \text{for } i \neq j,$$
$$p_{ii} = 1 - \sum_{j \neq i} q_{ij}\,\alpha_{ij}, \qquad (10.19)$$

are time reversible with respect to $\pi(i)$ and hence define a Markov chain
whose limiting distribution is the required one. This method is called the
Metropolis-Hastings algorithm.
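For a finite state space, (10.18) and (10.19) can be turned into a small routine that builds the time-reversible transition matrix from a proposal matrix and a target distribution. This is a sketch under assumptions: the uniform proposal `q`, the target `pi`, and the function name `metropolis_hastings_matrix` are chosen only for illustration, and every target probability is assumed to be positive.

```python
def metropolis_hastings_matrix(q, pi):
    """Build p_ij from a proposal matrix q and a target pi via (10.18)-(10.19)."""
    m = len(q)
    p = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i != j and q[i][j] > 0.0:
                alpha = min(1.0, pi[j] * q[j][i] / (pi[i] * q[i][j]))
                p[i][j] = q[i][j] * alpha
        # Whatever probability is not used for accepted moves stays at i.
        p[i][i] = 1.0 - sum(p[i])
    return p


q = [[1.0 / 3.0] * 3 for _ in range(3)]  # hypothetical uniform proposal
pi = [0.2, 0.3, 0.5]                     # hypothetical target distribution
p = metropolis_hastings_matrix(q, pi)

# Time reversibility (10.16): pi(i) p_ij equals pi(j) p_ji for every pair.
balanced = all(
    abs(pi[i] * p[i][j] - pi[j] * p[j][i]) < 1e-12
    for i in range(3) for j in range(3)
)
print(balanced)  # True
```

Because the diagonal absorbs all rejected proposals, each row of `p` still sums to one.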


Example 10.4 Consider a random walk Markov chain:

A <-> B <-> C <-> D

All transition probabilities are 0.5, except that the transitions 'from A to
B' and 'from D to C' are 1. The transition matrix of the chain is given by

       A     B     C     D

A      0     1     0     0
B      0.5   0     0.5   0
C      0     0.5   0     0.5
D      0     0     1     0

Based on the Metropolis-Hastings algorithm, construct a Markov chain whose
limiting distribution is $(\frac14, \frac14, \frac14, \frac14)$.

Simple calculation shows that the limiting distribution of the original
Markov chain is $(\frac16, \frac13, \frac13, \frac16)$. To construct the
desired Markov chain, we have to compute the probabilities $\alpha_{ij}$. For
instance,

$$\alpha_{AB} = \min\left(1,\; \frac{\pi(B)\,P(A|B)}{\pi(A)\,P(B|A)}\right)
= \min\left(1,\; \frac{\frac14 \cdot \frac12}{\frac14 \cdot 1}\right) = \frac12.$$

This means the transition probability 'from A to B' is reduced from 1 to
$1 \times \frac12 = \frac12$. For node 'A', the remaining transition
probability corresponds to the event that no transition occurs. Transition
probabilities for the other nodes are obtained in the same manner. The final
transition matrix becomes:

       A     B     C     D

A      0.5   0.5   0     0
B      0.5   0     0.5   0
C      0     0.5   0     0.5
D      0     0     0.5   0.5

It is easy to verify that the limiting distribution of this Markov chain is
$(\frac14, \frac14, \frac14, \frac14)$. □
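The chain of Example 10.4 can also be checked by simulation: propose a move with the original random walk and accept it with the probability in (10.18). The sketch below is illustrative; the seed, the run length, and the names `Q`, `TARGET`, and `mh_step` are arbitrary choices.

```python
import random

# Proposal: the original random walk on A, B, C, D (indices 0 to 3).
Q = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
TARGET = [0.25, 0.25, 0.25, 0.25]


def mh_step(i, rng):
    """Propose j from row i of Q and accept it with probability alpha_ij."""
    j = rng.choices(range(4), weights=Q[i])[0]
    alpha = min(1.0, TARGET[j] * Q[j][i] / (TARGET[i] * Q[i][j]))
    return j if rng.random() < alpha else i


rng = random.Random(7)
n_steps = 200_000
counts = [0, 0, 0, 0]
state = 0
for _ in range(n_steps):
    state = mh_step(state, rng)
    counts[state] += 1
freqs = [c / n_steps for c in counts]
print([round(f, 3) for f in freqs])  # each entry should be near 0.25
```

Rejected proposals leave the chain where it is, which is what produces the diagonal entries 0.5 for nodes A and D in the final transition matrix.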


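Let us connect Gibbs sampling with the Metropolis-Hastings algorithm.
Suppose we have $r$ parameters, i.e., $\theta = (\theta_1, \ldots, \theta_r)$,
in the model. We want to generate $\theta \sim \pi(\theta)$. Let
$\theta^{(n)}$ be the state of $\theta$ at time $n$ and $\theta^{(n+1)}$ be
the state of $\theta$ at time $n + 1$. The sequence of vectors
$\{\theta^{(n)}\}$ forms a Markov chain by virtue of the Gibbs sampling.
Therefore, $\theta^{(n)}$ jumps to $\theta^{(n+1)}$ and they differ only in
one component. To fix ideas, we assume that the component that differs is the
first one. Since we sample $\theta_1^{(n+1)}$ from the density function
$P(\theta_1^{(n+1)} | \theta_2^{(n)}, \ldots, \theta_r^{(n)})$, the probability
density function for the candidate Markov chain is given by

$$q(\theta^{(n+1)} | \theta^{(n)}) = P(\theta_1^{(n+1)} | \theta_2^{(n)}, \ldots, \theta_r^{(n)}).$$

To construct a Markov chain whose limiting probability is the target one, the
Metropolis-Hastings algorithm multiplies $q$ by the probability $\alpha$
defined in (10.18). As $\theta$ is a vector of continuous random variables,
the definition of $\alpha$ in (10.18) is modified to be

$$\alpha(\theta^{(n+1)} | \theta^{(n)}) = \min\left\{1,\;
\frac{\pi(\theta^{(n+1)})\,q(\theta^{(n)} | \theta^{(n+1)})}{\pi(\theta^{(n)})\,q(\theta^{(n+1)} | \theta^{(n)})}\right\}.$$

Consider the following identities. Since $\theta_i^{(n+1)} = \theta_i^{(n)}$
for $i = 2, \ldots, r$,

$$\pi(\theta^{(n+1)}) = P(\theta_1^{(n+1)} | \theta_2^{(n+1)}, \ldots, \theta_r^{(n+1)})\,P(\theta_2^{(n+1)}, \ldots, \theta_r^{(n+1)})$$
$$= P(\theta_1^{(n+1)} | \theta_2^{(n)}, \ldots, \theta_r^{(n)})\,P(\theta_2^{(n)}, \ldots, \theta_r^{(n)})$$
$$= q(\theta^{(n+1)} | \theta^{(n)})\,P(\theta_2^{(n)}, \ldots, \theta_r^{(n)}) \qquad (10.20)$$

and

$$\pi(\theta^{(n)}) = P(\theta_1^{(n)} | \theta_2^{(n)}, \ldots, \theta_r^{(n)})\,P(\theta_2^{(n)}, \ldots, \theta_r^{(n)})$$
$$= P(\theta_1^{(n)} | \theta_2^{(n+1)}, \ldots, \theta_r^{(n+1)})\,P(\theta_2^{(n)}, \ldots, \theta_r^{(n)})$$
$$= q(\theta^{(n)} | \theta^{(n+1)})\,P(\theta_2^{(n)}, \ldots, \theta_r^{(n)}). \qquad (10.21)$$

Therefore, we deduce that

$$P(\theta_2^{(n)}, \ldots, \theta_r^{(n)})
= \frac{\pi(\theta^{(n+1)})}{q(\theta^{(n+1)} | \theta^{(n)})}
= \frac{\pi(\theta^{(n)})}{q(\theta^{(n)} | \theta^{(n+1)})},$$

which implies that $\alpha = 1$. Hence, Gibbs sampling is a special case of
the Metropolis-Hastings algorithm in which every jump is accepted.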
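The identity $\alpha = 1$ can be checked numerically on a small discrete example. The joint distribution `JOINT` below is made up for illustration, and the function names are hypothetical; the sketch evaluates the Metropolis-Hastings acceptance probability for every Gibbs update of the first component.

```python
import math

# Hypothetical joint distribution of (theta_1, theta_2) on {0,1} x {0,1,2}.
JOINT = {
    (0, 0): 0.10, (0, 1): 0.25, (0, 2): 0.05,
    (1, 0): 0.20, (1, 1): 0.10, (1, 2): 0.30,
}


def conditional_first(a, b):
    """P(theta_1 = a | theta_2 = b) under JOINT."""
    marginal = sum(p for (_, b2), p in JOINT.items() if b2 == b)
    return JOINT[(a, b)] / marginal


def acceptance(old, new):
    """Acceptance probability when theta_1 is updated from its conditional."""
    assert old[1] == new[1], "only theta_1 changes in this move"
    q_forward = conditional_first(new[0], old[1])   # q(new | old)
    q_backward = conditional_first(old[0], new[1])  # q(old | new)
    return min(1.0, JOINT[new] * q_backward / (JOINT[old] * q_forward))


alphas = [
    acceptance((a_old, b), (a_new, b))
    for b in range(3) for a_old in range(2) for a_new in range(2)
]
print(all(math.isclose(a, 1.0) for a in alphas))  # True
```

The ratio collapses to one because both $\pi(\text{new})/q(\text{new}|\text{old})$ and $\pi(\text{old})/q(\text{old}|\text{new})$ equal the marginal probability of the unchanged component, mirroring (10.20) and (10.21).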


10.6 EXERCISES 
1. Suppose $X_1, \ldots, X_n$ are independent observations that follow
$N(\mu, \sigma^2)$, where $\mu$ is a known quantity.


(a) Show that the likelihood function $L(\sigma^2)$ satisfies

$$L(\sigma^2) \propto (\sigma^2)^{-n/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(X_i - \mu)^2\right\}.$$


(b) Suppose further that $\sigma^2 \sim IG(\alpha, \beta)$. What is the
conditional distribution of $\sigma^2 | X_1, \ldots, X_n$?
Hint: Denote by $p(\phi)$ the density of the inverse gamma distribution.
Then we have

$$p(\phi) \propto \phi^{-(\alpha+1)}\,e^{-\beta/\phi}.$$


2. A density function with a single parameter, $p(x|\theta)$, is said to be of
the exponential family if it takes the form

$$p(x|\theta) = g(x)\,h(\theta)\exp\left[t(x)\,\psi(\theta)\right].$$

Show that the normal distribution with unknown mean and known variance, the
normal distribution with unknown variance and known mean, the Poisson
distribution, and the binomial distribution are of the exponential family.


3. Show that if the likelihood function comes from the exponential family
and the prior distribution is from the exponential family, then the
posterior distribution also belongs to the exponential family.
4. Simulate the daily jump-diffusion VaR of the Dow Jones Industrial Index
based on the data used in Section 5.2.3. Compare your number with
the GED-VaR defined in Chapter 5.


5. Suppose that $x|p \sim Bin(n, p)$ and $p|x \sim Be(x + \alpha, n - x + \beta)$,
where $n$ is a Poisson variable of mean $\lambda$. Use the Gibbs sampling to
find the unconditional distribution of $n$ where $\lambda = 16$, $\alpha = 2$
and $\beta = 4$.


6. Consider the normal distribution with an unknown mean $\mu$ and a known
variance.


(a) Assume that the prior of $\mu$ is a discrete mixture of two normal
densities. Show that this prior is still conjugate.

(b) Assume that the prior of $\mu$ is a discrete mixture of $k$ normal
densities. Is the prior still conjugate?


7. Consider the following transition matrix of a Markov chain:


;1 2 3 4 


1/1/6 0 1/2 1/3 
2} 0 1/3 1/3 1/3 
S130 Aj. 0 178 
4]1/4 1/4 1/4 1/4 


Use the Metropolis-Hastings algorithm to construct a Markov chain 
whose limiting distribution is (1/6,1/6,1/3,1/3) based on the above 
matrix. 


8. Consider the transition matrix of another Markov chain:

       1     2     3     4

1     1/2    0    1/2    0
2     2/3    0    1/6   1/6
3      0    1/3    0    2/3
4     1/4   1/4   1/4   1/4

Use the Metropolis-Hastings algorithm to construct a Markov chain
whose limiting distribution is (1/10, 2/10, 3/10, 4/10) based on the above
matrix.


Simulation Techniques in Financial Risk Management 
by Ngai Hang Chan and Hoi Ying Wong 
Copyright © 2006 John Wiley & Sons, Inc. 
Answers to Selected Exercises


11.1 CHAPTER 1


? vai, ($2) = 2(L2)'_ (a2) 
= i ' fed)" _p 
ey (zi) 
Var(f) = Var, Ae) 


ll 
3) 3 
Ms 
eo, 
SS 
8 
— 
SS 

i) 

| 
3 | 

oa 


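4. (a) $m_S = E(R_S) = E\left(\frac{S_T - S}{S}\right) = \frac{E(S_T) - S}{S} = \frac{pS_u + (1-p)S_d - S}{S}.$

(b) $\mathrm{Var}(R_S) = \mathrm{Var}\left(\frac{S_T - S}{S}\right) = \frac{\mathrm{Var}(S_T)}{S^2} = \frac{p(1-p)(S_u - S_d)^2}{S^2},$

$v_S = \sqrt{\mathrm{Var}(R_S)} = \sqrt{p(1-p)}\,\frac{S_u - S_d}{S}.$

(c) Similarly, $E(R_C) = E\left(\frac{C_T - C}{C}\right) = \frac{E(C_T) - C}{C} = \frac{pC_u + (1-p)C_d - C}{C}.$

(d) $\mathrm{Var}(R_C) = \mathrm{Var}\left(\frac{C_T - C}{C}\right) = \frac{\mathrm{Var}(C_T)}{C^2} = \frac{p(1-p)(C_u - C_d)^2}{C^2},$

$v_C = \sqrt{\mathrm{Var}(R_C)} = \sqrt{p(1-p)}\,\frac{C_u - C_d}{C}.$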
(e) Right-hand side
ae Su-S 


Cyu-C, 
pl=p) Ee 


— UC 
= Left-hand side. 


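11.2 CHAPTER 2


1. (a) $EX_t = \sigma_1 E(W_t - W_{t_2}) - \sigma_2 E(W_{t_1} - W_{t_0}) = \sigma_1(0) - \sigma_2(0) = 0.$

$\mathrm{Var}X_t = \sigma_1^2\,\mathrm{Var}(W_t - W_{t_2}) + (-\sigma_2)^2\,\mathrm{Var}(W_{t_1} - W_{t_0}) = \sigma_1^2(t - t_2) + \sigma_2^2(t_1 - t_0).$


(b) $X_t = \sigma_1(W_t - W_{t_2}) - \sigma_2(W_{t_1} - W_{t_0})
= \sigma_1(W_t - W_{t_1}) + (\sigma_1 - \sigma_2)(W_{t_1} - W_{t_2}) - \sigma_2(W_{t_2} - W_{t_0}).$

$EX_t = \sigma_1 E(W_t - W_{t_1}) + (\sigma_1 - \sigma_2)E(W_{t_1} - W_{t_2}) - \sigma_2 E(W_{t_2} - W_{t_0}) = 0.$

$\mathrm{Var}X_t = \sigma_1^2\,\mathrm{Var}(W_t - W_{t_1}) + (\sigma_1 - \sigma_2)^2\,\mathrm{Var}(W_{t_1} - W_{t_2}) + \sigma_2^2\,\mathrm{Var}(W_{t_2} - W_{t_0})$
$= \sigma_1^2(t - t_1) + (\sigma_1 - \sigma_2)^2(t_1 - t_2) + \sigma_2^2(t_2 - t_0).$

(c) $EX_t = E\sum_{j=1}^{n} f(W_{t_{j-1}}, t_{j-1})(W_{t_j} - W_{t_{j-1}})
= \sum_{j=1}^{n} E[f(W_{t_{j-1}}, t_{j-1})]\,E(W_{t_j} - W_{t_{j-1}}) = 0.$

$\mathrm{Var}X_t = \mathrm{Var}\sum_{j=1}^{n} f(W_{t_{j-1}}, t_{j-1})(W_{t_j} - W_{t_{j-1}})
= \sum_{j=1}^{n} E[f(W_{t_{j-1}}, t_{j-1})^2]\,\mathrm{Var}(W_{t_j} - W_{t_{j-1}})$
$= \sum_{j=1}^{n} E[f(W_{t_{j-1}}, t_{j-1})^2]\,(t_j - t_{j-1}).$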
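(d) Let $0 = t_0 < t_1 < t_2 < \cdots < t_n = t$,

$$E\left[\int_0^t f(W_\tau, \tau)\,dW_\tau\right]
= E\left[\lim_{n\to\infty}\sum_{j=1}^{n} f(W_{t_{j-1}}, t_{j-1})(W_{t_j} - W_{t_{j-1}})\right]$$
$$= \lim_{n\to\infty}\sum_{j=1}^{n} E[f(W_{t_{j-1}}, t_{j-1})]\,E(W_{t_j} - W_{t_{j-1}})
= \lim_{n\to\infty}\sum_{j=1}^{n} E[f(W_{t_{j-1}}, t_{j-1})]\,(0) = 0.$$

$$E\left[\left(\int_0^t f(W_\tau, \tau)\,dW_\tau\right)^2\right]
= \lim_{n\to\infty}\sum_{i=1}^{n}\sum_{j=1}^{n}
E\left[f(W_{t_{i-1}}, t_{i-1})(W_{t_i} - W_{t_{i-1}})\,f(W_{t_{j-1}}, t_{j-1})(W_{t_j} - W_{t_{j-1}})\right].$$

Suppose $i < j$,

$$E\left[f(W_{t_{i-1}}, t_{i-1})\,f(W_{t_{j-1}}, t_{j-1})(W_{t_i} - W_{t_{i-1}})(W_{t_j} - W_{t_{j-1}})\right]$$
$$= E\left[f(W_{t_{i-1}}, t_{i-1})\,f(W_{t_{j-1}}, t_{j-1})(W_{t_i} - W_{t_{i-1}})\right]E(W_{t_j} - W_{t_{j-1}})$$
$$= E\left[f(W_{t_{i-1}}, t_{i-1})\,f(W_{t_{j-1}}, t_{j-1})(W_{t_i} - W_{t_{i-1}})\right](0) = 0.$$

Similarly, for $i > j$, the expectation is also zero. Hence

$$E\left[\left(\int_0^t f(W_\tau, \tau)\,dW_\tau\right)^2\right]
= \lim_{n\to\infty}\sum_{j=1}^{n} E\left[f(W_{t_{j-1}}, t_{j-1})^2 (W_{t_j} - W_{t_{j-1}})^2\right]$$
$$= \lim_{n\to\infty}\sum_{j=1}^{n} E[f(W_{t_{j-1}}, t_{j-1})^2]\,E(W_{t_j} - W_{t_{j-1}})^2
= \lim_{n\to\infty}\sum_{j=1}^{n} E[f(W_{t_{j-1}}, t_{j-1})^2]\,(t_j - t_{j-1})$$
$$= \int_0^t E[f(W_\tau, \tau)^2]\,d\tau.$$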
2. (a) Apply Itô's lemma,


2 L081 ds os 1, 0S 
dS; = Gis als ales 5) eG Jax ww 


1 

= as d+ 5saw 
5 1 

= -— =S dW. 
5yodt + 58d 


(c) As X; tends to negative infinity, S; should tend to zero. 


(e) Sto90 + Be*1000 = e792 +4(4)7 1000 _ ,- 938 
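3. Apply Itô's lemma,

$$dG = \left(aS\frac{\partial G}{\partial S} + \frac{\partial G}{\partial t} + \frac12 b^2S^2\frac{\partial^2 G}{\partial S^2}\right)dt + bS\frac{\partial G}{\partial S}\,dW$$
$$= \left(aS\cdot\tfrac12 S^{-1/2} + \tfrac12 (bS)^2\left(-\tfrac14\right)S^{-3/2} + 0\right)dt + bS\left(\tfrac12 S^{-1/2}\right)dW$$
$$= \left(\tfrac12 a - \tfrac18 b^2\right)G\,dt + \tfrac12 b\,G\,dW.$$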

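4. (a) Apply Itô's lemma,

$$d\log S = \left(0.1 - \tfrac12(0.3)^2\right)dt + 0.3\,dW.$$

Since $\log S(0) = 0$, we have

$$\log S(t) = 0.055t + 0.3\sqrt{t}\,Z, \quad \text{where } Z \sim N(0,1),$$

$$p = \lim_{t\to\infty}\frac1t\log S(t) = \lim_{t\to\infty}\frac1t\left(0.055t + 0.3\sqrt{t}\,Z\right) = 0.055 + \lim_{t\to\infty}\frac{0.3Z}{\sqrt{t}} = 0.055.$$

(b) The limit does not exist.

$$\lim_{t\to\infty}\frac1t\left[\log S(t) - pt\right]^2 = \lim_{t\to\infty}\frac1t\left(0.3\sqrt{t}\,Z\right)^2 = \lim_{t\to\infty}\left(0.09Z^2\right) = 0.09Z^2.$$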


5. (a) The simulation result should match Itô's lemma.


(b) The difference between Itô's integral and the Stratonovich integral
tol 
1s 3 


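6. (a) $\frac{\partial Y}{\partial W} = e^{W}$, $\frac{\partial^2 Y}{\partial W^2} = e^{W}$,

$$dY = \tfrac12 e^{W}\,dt + e^{W}\,dW = \tfrac12 Y\,dt + Y\,dW.$$

(b) By (a),

$$\int_0^t dY = \int_0^t \tfrac12 e^{W_\tau}\,d\tau + \int_0^t e^{W_\tau}\,dW_\tau,$$

$$\int_0^t Y\,dW = Y_t - Y_0 - \int_0^t \tfrac12 e^{W_\tau}\,d\tau = e^{W_t} - 1 - \tfrac12\int_0^t e^{W_\tau}\,d\tau.$$

7. (a) $dX = \mu\,dt + \sigma\,dW.$
$df = (X + \mu t)\,dt + \sigma t\,dW.$
$dg = (X^2 + 2\mu tX + \sigma^2 t)\,dt + 2\sigma tX\,dW.$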


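(b) Let $0 = t_0 < t_1 < t_2 < \cdots < t_n = t$,

$$A_t = \frac1t\int_0^t X_\tau\,d\tau = \lim_{n\to\infty}\frac1n\sum_{i=1}^{n} X_{t_i}.$$

As all $X_{t_i}$ are normal, $A_t$ is also normal, with

$$EA_t = \lim_{n\to\infty}\frac1n\sum_{i=1}^{n} EX_{t_i} = \frac1t\int_0^t EX_\tau\,d\tau,$$

$$\mathrm{Var}A_t = \lim_{n\to\infty}\frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\sigma^2\min(t_i, t_j)
= \frac{\sigma^2}{t^2}\int_0^t\!\!\int_0^t \min(\rho, \tau)\,d\rho\,d\tau
= \frac{\sigma^2}{t^2}\cdot\frac{t^3}{3} = \frac{\sigma^2 t}{3}.$$

(c) $$\mathrm{Cov}(X_t, A_t) = \lim_{n\to\infty}\frac1n\sum_{i=1}^{n}\mathrm{Cov}(X_t, X_{t_i})
= \lim_{n\to\infty}\frac1n\sum_{i=1}^{n}\sigma^2 t_i
= \frac{\sigma^2}{t}\int_0^t \tau\,d\tau = \frac{\sigma^2 t}{2}.$$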


11.3 CHAPTER 3


1. (a) Long one unit of call option and short $x$ units of stock.
When the share price goes up: $(75 - 65)^+ - 75x = 10 - 75x$.
Otherwise, the share price goes down: $(50 - 65)^+ - 50x = -50x$.
Therefore a riskless portfolio is found by $10 - 75x = -50x$, $x = 0.4$.


(b) Substitute $x = 0.4$. We consider the following cases.
When the share price goes up: $(75 - 65)^+ - 75(0.4) = -20$. Otherwise,
the share price goes down: $(50 - 65)^+ - 50(0.4) = -20$.
The value of the portfolio at maturity $= -20$.


(c) Expected value $= 0.7(75 - 65)^+ + 0.3(50 - 65)^+ = 7$.


(d) The discount factor = FEO ASNT) = 8/9. 
Reasonable price = 7(8/9) = 6.222. 


2. (a) $c_2 = e^{-rT}E[\text{payoff}] = e^{-rT}\left[p^2 c_{uu} + 2p(1-p)c_{ud} + (1-p)^2 c_{dd}\right].$


(b) We prove this by induction. By (a), the statement

$$c_n = e^{-rT}\sum_{j=0}^{n}\left\{{}_nC_j\,q^j(1-q)^{n-j}\max(Su^jd^{n-j} - K, 0)\right\}$$

is true for $n = 2$. Suppose this is true for $n = k$ and all $T$, i.e.,

$$c_k = e^{-rT}\sum_{j=0}^{k}{}_kC_j\,q^j(1-q)^{k-j}\max(Su^jd^{k-j} - K, 0).$$

For $n = k + 1$, by the binomial assumption,

$$c_{k+1} = e^{-r\delta t}\left[q\,c_k^{u} + (1-q)\,c_k^{d}\right],$$

where $c_k^{u}$ and $c_k^{d}$ are the call prices when the stock price goes up
and goes down respectively, i.e.,

$$c_k^{u} = e^{-r(T-\delta t)}\sum_{j=0}^{k}{}_kC_j\,q^j(1-q)^{k-j}\max((Su)u^jd^{k-j} - K, 0),$$

$$c_k^{d} = e^{-r(T-\delta t)}\sum_{j=0}^{k}{}_kC_j\,q^j(1-q)^{k-j}\max((Sd)u^jd^{k-j} - K, 0).$$

Hence

$$c_{k+1} = e^{-rT}\sum_{j=0}^{k}{}_kC_j\,q^{j+1}(1-q)^{k-j}\max(Su^{j+1}d^{k-j} - K, 0)
+ e^{-rT}\sum_{j=0}^{k}{}_kC_j\,q^{j}(1-q)^{(k+1)-j}\max(Su^{j}d^{(k+1)-j} - K, 0)$$

$$= e^{-rT}\sum_{j=1}^{k+1}{}_kC_{j-1}\,q^{j}(1-q)^{(k+1)-j}\max(Su^{j}d^{(k+1)-j} - K, 0)
+ e^{-rT}\sum_{j=0}^{k}{}_kC_j\,q^{j}(1-q)^{(k+1)-j}\max(Su^{j}d^{(k+1)-j} - K, 0)$$

$$= e^{-rT}\sum_{j=1}^{k}\left({}_kC_{j-1} + {}_kC_j\right)q^{j}(1-q)^{(k+1)-j}\max(Su^{j}d^{(k+1)-j} - K, 0)
+ e^{-rT}q^{k+1}\max(Su^{k+1} - K, 0) + e^{-rT}(1-q)^{k+1}\max(Sd^{k+1} - K, 0)$$

$$= e^{-rT}\sum_{j=0}^{k+1}{}_{k+1}C_j\,q^{j}(1-q)^{(k+1)-j}\max(Su^{j}d^{(k+1)-j} - K, 0).$$

The statement is also true for $n = k + 1$, and the proof is completed.


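(c) Using normal approximation,

$${}_nC_x\,q^x(1-q)^{n-x} \approx \frac{1}{\sqrt{2\pi nq(1-q)}}\exp\left\{-\frac{(x - nq)^2}{2nq(1-q)}\right\}.$$

Then

$$\lim_{n\to\infty} c_n = \lim_{n\to\infty} e^{-rT}\sum_{j=0}^{n}{}_nC_j\,q^j(1-q)^{n-j}\max(Su^jd^{n-j} - K, 0)$$
$$= e^{-rT}\lim_{n\to\infty}\sum_{j=0}^{n}\frac{1}{\sqrt{2\pi nq(1-q)}}\exp\left\{-\frac{(j - nq)^2}{2nq(1-q)}\right\}\left(Se^{\sigma\sqrt{\delta t}(2j - n)} - K\right)^+.$$

Let $Z_j = \frac{j - nq}{\sqrt{nq(1-q)}}$, $j = 0, 1, \ldots, n$, and
$\delta Z_j = Z_j - Z_{j-1} = \frac{1}{\sqrt{nq(1-q)}}$; we have

$$\lim_{n\to\infty} c_n = e^{-rT}\lim_{n\to\infty}\sum_{j=0}^{n}\frac{\delta Z_j}{\sqrt{2\pi}}\exp\left\{-\frac{Z_j^2}{2}\right\}\left(Se^{\sigma X_n} - K\right)^+,$$

where $X_n = \sqrt{\delta t}\left[2Z_j\sqrt{nq(1-q)} + n(2q - 1)\right]$. Since $n\,\delta t = T$,

$$\lim_{n\to\infty} X_n = \lim_{n\to\infty}\left[2\sqrt{n\,\delta t}\,Z_j\sqrt{q(1-q)} + n\sqrt{\delta t}\,(2q - 1)\right]
= 2\sqrt{T}\,Z_j\lim_{n\to\infty}\sqrt{q(1-q)} + T\lim_{n\to\infty}\frac{2q - 1}{\sqrt{\delta t}}.$$

Recall that $q = \frac{e^{r\delta t} - d}{u - d}$, $u = e^{\sigma\sqrt{\delta t}}$,
$d = e^{-\sigma\sqrt{\delta t}}$. By L'Hôpital's rule in the variable $\sqrt{\delta t}$,

$$\lim_{n\to\infty} q = \lim_{\sqrt{\delta t}\to 0}\frac{e^{r\delta t} - d}{u - d}
= \lim_{\sqrt{\delta t}\to 0}\frac{2r\sqrt{\delta t}\,e^{r\delta t} + \sigma d}{\sigma u + \sigma d} = \frac12,$$

$$\lim_{n\to\infty}\frac{2q - 1}{\sqrt{\delta t}}
= \lim_{\sqrt{\delta t}\to 0}\frac{2e^{r\delta t} - u - d}{\sqrt{\delta t}\,(u - d)}
= \lim_{\sqrt{\delta t}\to 0}\frac{4r\sqrt{\delta t}\,e^{r\delta t} - \sigma u + \sigma d}{(u - d) + \sqrt{\delta t}\,(\sigma u + \sigma d)}$$
$$= \lim_{\sqrt{\delta t}\to 0}\frac{(8r^2\delta t + 4r)e^{r\delta t} - \sigma^2 u - \sigma^2 d}{(\sigma u + \sigma d) + (\sigma u + \sigma d) + \sqrt{\delta t}\,(\sigma^2 u - \sigma^2 d)}
= \frac{4r - 2\sigma^2}{4\sigma}.$$

Hence, $\lim_{n\to\infty} X_n = \sqrt{T}\,Z_j + T(r/\sigma - \sigma/2)$, and

$$\lim_{n\to\infty} c_n = \lim_{n\to\infty} e^{-rT}\sum_{j=0}^{n}\frac{\delta Z_j}{\sqrt{2\pi}}\exp\left\{-\frac{Z_j^2}{2}\right\}\left(Se^{(r - \sigma^2/2)T + \sigma\sqrt{T}Z_j} - K\right)^+$$
$$= e^{-rT}\int_{-\infty}^{\infty}\frac{1}{\sqrt{2\pi}}e^{-Z^2/2}\left(Se^{(r - \sigma^2/2)T + \sigma\sqrt{T}Z} - K\right)^+dZ$$
$$= e^{-rT}E\left[\max\left(Se^{(r - \sigma^2/2)T + \sigma\sqrt{T}Z} - K,\, 0\right)\right]$$
$$= \text{call price in Black-Scholes}.$$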
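The convergence in (c) can be observed numerically by evaluating the $n$-step sum from (b) next to the Black-Scholes formula. The sketch below is illustrative only: the parameter values ($S = K = 100$, $r = 0.05$, $\sigma = 0.2$, $T = 1$) and the function names are assumptions, not taken from the exercise.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(S, K, r, sigma, T):
    """Black-Scholes price of a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def binomial_call(S, K, r, sigma, T, n):
    """n-step price: e^{-rT} sum_j C(n,j) q^j (1-q)^(n-j) max(S u^j d^(n-j) - K, 0)."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    q = (math.exp(r * dt) - d) / (u - d)
    total = 0.0
    for j in range(n + 1):
        # Work with the log of the binomial weight to avoid overflow for large n.
        log_w = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                 + j * math.log(q) + (n - j) * math.log(1.0 - q))
        payoff = max(S * math.exp(sigma * math.sqrt(dt) * (2 * j - n)) - K, 0.0)
        total += math.exp(log_w) * payoff
    return math.exp(-r * T) * total


bs = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
for n in (10, 100, 1000):
    print(n, binomial_call(100.0, 100.0, 0.05, 0.2, 1.0, n), bs)
```

The gap between the two prices shrinks as $n$ grows, which is the numerical counterpart of the limit derived above.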


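3. Left-hand side $-$ Right-hand side

$= p + S - c - Ke^{-rT}$

$= e^{-rT}E[(K - S_T)^+] + S - e^{-rT}E[(S_T - K)^+] - Ke^{-rT}$

$= e^{-rT}E[(K - S_T)^+ - (S_T - K)^+] + S - Ke^{-rT}$

$= e^{-rT}E[K - S_T] + S - Ke^{-rT}$

$= e^{-rT}K - e^{-rT}E[S_T] + S - Ke^{-rT}$

$= e^{-rT}K - S + S - Ke^{-rT}$

$= 0.$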


4. We know that $\log G_T \sim N(\log(S_0) + (r - \sigma^2/2)T/2,\; \sigma^2T/3)$. By Lemma 3.1,

$$\text{price} = e^{-rT}\left[E(G_T)\Phi(d_1) - K\Phi(d_2)\right],$$

where

$$E(G_T) = S_0\exp\left\{(r - \sigma^2/2)T/2 + \tfrac12\sigma^2T/3\right\} = S_0\exp\left\{(r - \sigma^2/6)T/2\right\},$$

$$d_1 = \frac{\log(S_0/K) + (r - \sigma^2/2)T/2 + \sigma^2T/3}{\sigma\sqrt{T/3}} = \frac{\log(S_0/K) + (r + \sigma^2/6)T/2}{\sigma\sqrt{T/3}},$$

$$d_2 = \frac{\log(S_0/K) + (r - \sigma^2/2)T/2}{\sigma\sqrt{T/3}}.$$

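5. Let $X_\tau$ be a stochastic process that follows the dynamics
$dX = \mu(\tau, X)\,d\tau + \sigma(\tau, X)\,dW$, $X_t = x$. Applying Itô's lemma,

$$df = \left(\frac{\partial f}{\partial t} + \mu(t, X)\frac{\partial f}{\partial x} + \frac12\sigma^2(t, X)\frac{\partial^2 f}{\partial x^2}\right)dt + \sigma(t, X)\frac{\partial f}{\partial x}\,dW.$$

As $\frac{\partial f}{\partial t} + \mu(t, x)\frac{\partial f}{\partial x} + \frac12\sigma^2(t, x)\frac{\partial^2 f}{\partial x^2} + a(t, x)f = 0$,

$$df = -a(t, X)f\,dt + \sigma(t, X)\frac{\partial f}{\partial x}\,dW,$$
$$df + a(t, X)f\,dt = \sigma(t, X)\frac{\partial f}{\partial x}\,dW,$$
$$d\left[e^{\int_t^\tau a(s, X_s)\,ds}f\right] = e^{\int_t^\tau a(s, X_s)\,ds}\,\sigma(\tau, X)\frac{\partial f}{\partial x}\,dW,$$
$$\int_t^T d\left[e^{\int_t^\tau a(s, X_s)\,ds}f\right] = \int_t^T e^{\int_t^\tau a(s, X_s)\,ds}\,\sigma(\tau, X)\frac{\partial f}{\partial x}\,dW.$$

The right-hand side is an Itô integral with zero expectation, so

$$E\left[e^{\int_t^T a(\tau, X_\tau)\,d\tau}f(T, X_T) - f(t, X_t)\right] = 0,$$

$$E\left[e^{\int_t^T a(\tau, X_\tau)\,d\tau}F(X_T)\right] = f(t, x),$$

where $f(T, X_T) = F(X_T)$ and $dX = \mu(\tau, X)\,d\tau + \sigma(\tau, X)\,dW$, $X_t = x$.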
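6. (a) Follow the proof of Theorem 3.1, and substitute $r$ and $\sigma$ by $r(t)$ and $\sigma(t)$ respectively.

(b) Substitute $a(t,x) = -r(t)$, $\mu(t,x) = r(t)S_t$, $\sigma(t,x) = \sigma(t)S_t$, and $F(X_T) = F(S_T) = \max(S_T - K, 0)$ into Problem 5.

(c) Let $\tilde{S}_\tau$ be a stochastic process following the dynamics $d\tilde{S}_\tau = \bar{r}\tilde{S}_\tau\, d\tau + \bar{\sigma}\tilde{S}_\tau\, dW$, $\tilde{S}_t = S_t = S$, where $\bar{r} = \frac{1}{T-t}\int_t^T r(\tau)\, d\tau$ and $\bar{\sigma}^2 = \frac{1}{T-t}\int_t^T \sigma^2(\tau)\, d\tau$. By virtue of Itô's lemma, we have

\[
\begin{aligned}
d\log S_\tau &= \bigl( r(\tau) - \sigma^2(\tau)/2 \bigr)\, d\tau + \sigma(\tau)\, dW_\tau, \\
d\log \tilde{S}_\tau &= (\bar{r} - \bar{\sigma}^2/2)\, d\tau + \bar{\sigma}\, dW_\tau .
\end{aligned}
\]

Both $\log S_T$ and $\log \tilde{S}_T$ are normal, and

\[
\begin{aligned}
E[\log S_T] &= E[\log S_t] + \int_t^T \bigl( r(\tau) - \sigma^2(\tau)/2 \bigr)\, d\tau \\
&= E[\log \tilde{S}_t] + (T - t)(\bar{r} - \bar{\sigma}^2/2) = E[\log \tilde{S}_T],
\end{aligned}
\]

\[
\mathrm{Var}[\log S_T] = \int_t^T \sigma^2(\tau)\, d\tau = (T - t)\bar{\sigma}^2 = \mathrm{Var}[\log \tilde{S}_T] .
\]

Therefore, $\log S_T$ and $\log \tilde{S}_T$ share the same distribution, and so do $S_T$ and $\tilde{S}_T$. Hence

\[
\begin{aligned}
f(t, S) &= e^{-\int_t^T r(\tau)\, d\tau}\, E[\max(S_T - K, 0)] \\
&= e^{-\bar{r}(T-t)}\, E[\max(\tilde{S}_T - K, 0)] \\
&= C_{BS}(t, S_t \mid r = \bar{r},\ \sigma = \bar{\sigma}) .
\end{aligned}
\]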

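7. (a) $dS = rS\, dt + \sigma S\, dW$, so

\[
d\log S = (r - \sigma^2/2)\, dt + \sigma\, dW,
\]

\[
\log S_t \sim N\bigl( \log S_\tau + (r - \sigma^2/2)(t - \tau),\ \sigma^2 (t - \tau) \bigr) .
\]

With $X(t) = e^{-rt} S(t)$,

\[
\begin{aligned}
E(X(t) \mid X(s), s \le \tau)
&= e^{-rt} E(S(t) \mid X(s), s \le \tau) \\
&= e^{-rt} \exp\left\{ \log S_\tau + (r - \sigma^2/2)(t - \tau) + \tfrac{1}{2}\sigma^2 (t - \tau) \right\} \\
&= S_\tau e^{-r\tau} \\
&= X(\tau) .
\end{aligned}
\]

(b) For $\tau < t$,

\[
\begin{aligned}
E\bigl( C(t, S_t; T)\, e^{r(T-t)} \mid X(s), s \le \tau \bigr)
&= E\bigl( E((S_T - K)^+ \mid X(s), s \le t) \mid X(s), s \le \tau \bigr) \\
&= E\bigl( (S_T - K)^+ \mid X(s), s \le \tau \bigr) \\
&= C(\tau, S_\tau; T)\, e^{r(T-\tau)} .
\end{aligned}
\]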
ia Hlft 1 1 
Bese) d= gaa EG) ee 


Generate Y ~ U(0,1), then X = F7'H(Y) = Fes 7| +1 has the 
desired distribution. 


(b) Since an analytical formula for the c.d.f. is not readily available, we resort to a table lookup. First compute a sufficiently long sequence $\{f_i\}_{i=0,1,\cdots}$, where $f_0 = 0$ and $f_i = F(i) = P(X \le i) = \sum_{j \le i} P(X = j)$, and set the last element to be 1. Then generate $y \sim U(0,1)$, and search for the interval $[f_i, f_{i+1})$ in which $y$ lies. Next set $X = i + 1$, the smallest integer $k$ with $y < f_k$; then $X$ has the desired distribution.
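The table lookup in (b) can be sketched as follows. This is only an illustration under stated assumptions: the probabilities passed to `build_cdf_table` are hypothetical (the distribution of the exercise is not reproduced here), the support is taken to be 1, 2, ..., m, and the function names are made up for the sketch.

```python
import bisect
import itertools
import random


def build_cdf_table(pmf):
    """pmf[k-1] = P(X = k) for k = 1..m. Returns [f_0, f_1, ..., f_m] with f_0 = 0."""
    table = [0.0] + list(itertools.accumulate(pmf))
    table[-1] = 1.0  # set the last element to 1, as in the text
    return table


def sample(table, rng):
    """Draw X by locating y ~ U(0,1) in [f_i, f_{i+1}) and returning i + 1."""
    y = rng.random()
    # bisect_right gives the smallest index k with y < table[k], i.e. k = i + 1
    return bisect.bisect_right(table, y)
```

Usage: `table = build_cdf_table([0.2, 0.5, 0.3])` followed by `sample(table, random.Random(1))` returns a value in {1, 2, 3}. A binary search is used here instead of a linear scan, which is a design choice of the sketch and does not change the distribution of the output.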


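2. (a) $X = \lfloor 10U(1 - U) \rfloor$; as $U \in [0,1]$, $10U(1 - U) \in [0, 2.5]$. The boundaries are

$10U(1-U) = 0$ at $U \in \{0, 1\}$,

$10U(1-U) = 1$ at $U \in \bigl\{ \tfrac{1}{2} - \sqrt{\tfrac{3}{20}},\ \tfrac{1}{2} + \sqrt{\tfrac{3}{20}} \bigr\}$,

$10U(1-U) = 2$ at $U \in \bigl\{ \tfrac{1}{2} - \sqrt{\tfrac{1}{20}},\ \tfrac{1}{2} + \sqrt{\tfrac{1}{20}} \bigr\}$.

\[
\begin{aligned}
P(X = 0) &= P(10U(1-U) < 1) = \Bigl[ \bigl( \tfrac{1}{2} - \sqrt{\tfrac{3}{20}} \bigr) - 0 \Bigr] + \Bigl[ 1 - \bigl( \tfrac{1}{2} + \sqrt{\tfrac{3}{20}} \bigr) \Bigr] = 1 - \sqrt{\tfrac{3}{5}}, \\
P(X = 1) &= \Bigl[ \bigl( \tfrac{1}{2} - \sqrt{\tfrac{1}{20}} \bigr) - \bigl( \tfrac{1}{2} - \sqrt{\tfrac{3}{20}} \bigr) \Bigr] + \Bigl[ \bigl( \tfrac{1}{2} + \sqrt{\tfrac{3}{20}} \bigr) - \bigl( \tfrac{1}{2} + \sqrt{\tfrac{1}{20}} \bigr) \Bigr] = \sqrt{\tfrac{3}{5}} - \sqrt{\tfrac{1}{5}}, \\
P(X = 2) &= P(10U(1-U) \ge 2) = \bigl( \tfrac{1}{2} + \sqrt{\tfrac{1}{20}} \bigr) - \bigl( \tfrac{1}{2} - \sqrt{\tfrac{1}{20}} \bigr) = \sqrt{\tfrac{1}{5}} .
\end{aligned}
\]

(b) $Y = \lfloor 1/U \rfloor$; as $U \in (0,1]$, $1/U \in [1, \infty)$. For $k = 1, 2, \cdots$,

\[
P(Y = k) = P\Bigl( k \le \frac{1}{U} < k + 1 \Bigr) = P\Bigl( \frac{1}{k+1} < U \le \frac{1}{k} \Bigr) = \frac{1}{k} - \frac{1}{k+1} = \frac{1}{k(k+1)} .
\]

(c) $Z = (B - 3)^2$ with $B \sim \mathrm{Bin}(5, 0.5)$, so $Z \in \{0, 1, 4, 9\}$ and

\[
\begin{aligned}
P(Z = 0) &= P(B = 3) = {}_5C_3\, 0.5^5 = 10/32, \\
P(Z = 1) &= P(B = 2 \text{ or } 4) = {}_5C_2\, 0.5^5 + {}_5C_4\, 0.5^5 = 15/32, \\
P(Z = 4) &= P(B = 1 \text{ or } 5) = {}_5C_1\, 0.5^5 + {}_5C_5\, 0.5^5 = 6/32, \\
P(Z = 9) &= P(B = 0) = {}_5C_0\, 0.5^5 = 1/32 .
\end{aligned}
\]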
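The probabilities for X = ⌊10U(1 − U)⌋ in part (a) of Problem 2 can be checked by simulation. The sketch below is an illustration only; the names `EXACT` and `simulate`, the sample size and the seed are assumptions, and the closed-form values come from solving 10U(1 − U) = 1 and 10U(1 − U) = 2 for U.

```python
import math
import random
from collections import Counter

# Closed-form probabilities from the roots of 10u(1-u) = 1 and 10u(1-u) = 2
EXACT = {
    0: 1.0 - math.sqrt(3.0 / 5.0),
    1: math.sqrt(3.0 / 5.0) - math.sqrt(1.0 / 5.0),
    2: math.sqrt(1.0 / 5.0),
}


def simulate(n: int, seed: int = 0) -> dict:
    """Empirical frequencies of X = floor(10 U (1 - U)) over n uniform draws."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n):
        u = rng.random()
        counts[math.floor(10.0 * u * (1.0 - u))] += 1
    return {k: counts[k] / n for k in (0, 1, 2)}
```

Comparing `simulate(200000)` with `EXACT` should show agreement to about two decimal places.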


3. (a) As the functions in (a), (b) are monotonic, 


fx(z) = lez fu(U) = (pr se 


fx(z) = |(2m sec?(nU))~*| fu(U) = 
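(c) Suppose $W \in [w, w + \Delta)$ and $nU \in [k, k+1)$, $k = 0, 1, \cdots, n-1$. Then $W = nU - \lfloor nU \rfloor = nU - k \in [w, w + \Delta)$, that is, $U \in \bigl[ \frac{w+k}{n}, \frac{w+k+\Delta}{n} \bigr)$. Hence

\[
\begin{aligned}
f_W(w) &= \lim_{\Delta \to 0} \frac{1}{\Delta} P\bigl( W \in [w, w + \Delta) \bigr) \\
&= \lim_{\Delta \to 0} \frac{1}{\Delta} \sum_{k=0}^{n-1} P\Bigl( U \in \Bigl[ \frac{w+k}{n}, \frac{w+k+\Delta}{n} \Bigr) \Bigr) \\
&= 1 .
\end{aligned}
\]

Moreover,

\[
P(W \le w, I = i) = P\Bigl( W \le w,\ U \in \Bigl[ \frac{i}{n}, \frac{i+1}{n} \Bigr) \Bigr) = P\Bigl( U \in \Bigl[ \frac{i}{n}, \frac{i+w}{n} \Bigr] \Bigr) = \frac{w}{n} .
\]

Therefore, $P(W \le w, I = i) = \dfrac{w}{n} = (w)\dfrac{1}{n} = P(W \le w) P(I = i)$.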


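4. (a) Generate $U \sim U(0,1)$, set $X = \lfloor 7U \rfloor + 1$, and accept $X$ with probability $P(X)/P(X = 6) = P(X)/0.17$.

(c) Probability of acceptance $= \sum_{x=1}^{7} \dfrac{1}{7} \cdot \dfrac{P(x)}{0.17} = \dfrac{1}{7 \times 0.17} = \dfrac{1}{1.19}$.

Expected number of acceptances $= 1000 \times \dfrac{1}{1.19} \approx 840$.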
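A short simulation illustrates the acceptance rate in (c). The sketch below is hedged: the probability table `PMF` is hypothetical (chosen only so that it sums to one and has its maximum 0.17 at X = 6, as in the solution), and the function name is an assumption. The acceptance rate 1/(7 × 0.17) does not depend on the other entries of the table.

```python
import random

# Hypothetical target probabilities on {1, ..., 7}; only the maximum 0.17 at X = 6
# is taken from the solution above.
PMF = {1: 0.10, 2: 0.13, 3: 0.15, 4: 0.16, 5: 0.14, 6: 0.17, 7: 0.15}
P_MAX = max(PMF.values())


def acceptance_rate(n_proposals: int, seed: int = 0) -> float:
    """Fraction of uniform proposals on {1,...,7} accepted with probability P(x)/P_MAX."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(n_proposals):
        x = int(7 * rng.random()) + 1          # proposal X = floor(7U) + 1
        if rng.random() < PMF[x] / P_MAX:      # independent uniform for the accept step
            accepted += 1
    return accepted / n_proposals
```

With 1000 proposals the number of accepted draws should be close to 1000/1.19, in line with the figure of about 840 above.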


7 x 
5. (a) Thec.d.f. F(x) = $2, 


X=FO(U)= yw +5- * where U ~ U(0,1). 


(b) Generate U ~ U(0,1), and accept X = U with probability 210 | 
Probability of acceptance = i, Bix) dx = 2 


6. (a) c= max {a} = Fez max(e* P42) a = Jezmax(e'— re te € 


1/2. 


(b) Generate 2 ~ g(x), and accept x with probability WET ad ot 


11.5 CHAPTER 5 


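4. 
\[
E(\bar{X}_n) = E\Bigl( \frac{1}{n} \sum_{i=1}^{n} X_i \Bigr) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n}(n\theta) = \theta,
\]

\[
\mathrm{Var}(\bar{X}_n) = \mathrm{Var}\Bigl( \frac{1}{n} \sum_{i=1}^{n} X_i \Bigr) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{1}{n^2}(n\sigma^2) = \frac{\sigma^2}{n} .
\]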
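5. 
\[
\begin{aligned}
E(S^2) &= E\Bigl( \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X}_n)^2 \Bigr) \\
&= \frac{1}{n-1} E \sum_{i=1}^{n} \bigl[ (X_i - \theta) - (\bar{X} - \theta) \bigr]^2 \\
&= \frac{1}{n-1} E \sum_{i=1}^{n} \bigl[ (X_i - \theta)^2 - 2(X_i - \theta)(\bar{X} - \theta) + (\bar{X} - \theta)^2 \bigr] \\
&= \frac{1}{n-1} \Bigl[ E \sum_{i=1}^{n} (X_i - \theta)^2 - 2E \sum_{i=1}^{n} (X_i - \theta)(\bar{X} - \theta) + E \sum_{i=1}^{n} (\bar{X} - \theta)^2 \Bigr] \\
&= \frac{1}{n-1} \Bigl[ \sum_{i=1}^{n} E(X_i - \theta)^2 - 2E\, n(\bar{X} - \theta)^2 + nE(\bar{X} - \theta)^2 \Bigr] \\
&= \frac{1}{n-1} \Bigl[ n\sigma^2 - 2n\frac{\sigma^2}{n} + n\frac{\sigma^2}{n} \Bigr] \\
&= \sigma^2 .
\end{aligned}
\]

6. 
\[
\begin{aligned}
S_{j+1}^2 - \Bigl( 1 - \frac{1}{j} \Bigr) S_j^2
&= \frac{1}{j} \Bigl[ \sum_{i=1}^{j+1} (X_i - \bar{X}_{j+1})^2 - \sum_{i=1}^{j} (X_i - \bar{X}_j)^2 \Bigr] \\
&= \frac{1}{j} \sum_{i=1}^{j} \bigl[ (X_i - \bar{X}_{j+1})^2 - (X_i - \bar{X}_j)^2 \bigr] + \frac{1}{j} (X_{j+1} - \bar{X}_{j+1})^2 \\
&= \frac{1}{j} \sum_{i=1}^{j} \bigl[ \bar{X}_{j+1}^2 - \bar{X}_j^2 - 2(\bar{X}_{j+1} - \bar{X}_j) X_i \bigr] + \frac{1}{j} (X_{j+1} - \bar{X}_{j+1})^2 \\
&= \bar{X}_{j+1}^2 - \bar{X}_j^2 - 2(\bar{X}_{j+1} - \bar{X}_j)\bar{X}_j + \frac{1}{j} (X_{j+1} - \bar{X}_{j+1})^2 \\
&= \bar{X}_{j+1}^2 - 2\bar{X}_{j+1}\bar{X}_j + \bar{X}_j^2 + \frac{1}{j} \bigl[ j(\bar{X}_{j+1} - \bar{X}_j) \bigr]^2 \\
&= (j+1)(\bar{X}_{j+1} - \bar{X}_j)^2 ,
\end{aligned}
\]

where the fifth line uses $X_{j+1} - \bar{X}_{j+1} = j(\bar{X}_{j+1} - \bar{X}_j)$.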
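The recursion S²_{j+1} = (1 − 1/j) S²_j + (j + 1)(X̄_{j+1} − X̄_j)² allows the sample mean and variance to be updated one observation at a time. The following is a minimal sketch of that update; the function name is an assumption, and the recursion is started at j = 1, where the factor (1 − 1/j) vanishes.

```python
import random
import statistics


def running_mean_var(xs):
    """Sample mean and sample variance of xs (len >= 2), updated one point at a time."""
    mean, var = xs[0], 0.0
    # j counts the observations already absorbed before the new point arrives
    for j, x in enumerate(xs[1:], start=1):
        new_mean = mean + (x - mean) / (j + 1)
        var = (1.0 - 1.0 / j) * var + (j + 1) * (new_mean - mean) ** 2
        mean = new_mean
    return mean, var
```

The result can be compared with `statistics.mean` and `statistics.variance` on the same data; the two should agree up to floating-point rounding.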


7. (a) Gamma measures the curvature of the portfolio. 
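(b) Let $S = e^y$ and $\nu = r - \sigma^2/2$, so that $x = \log S_T$ has the density

\[
\phi(x \mid \log S) = \frac{1}{\sqrt{2\pi\sigma^2 T}} \exp\left\{ -\frac{(x - y - \nu T)^2}{2\sigma^2 T} \right\} .
\]

Then, as $\dfrac{\partial}{\partial S} = \dfrac{1}{S}\dfrac{\partial}{\partial y}$,

\[
\begin{aligned}
\frac{\partial^2}{\partial S^2} \phi(x \mid \log S)
&= \frac{1}{S^2} \left( \frac{\partial^2 \phi}{\partial y^2} - \frac{\partial \phi}{\partial y} \right) \\
&= \frac{\phi(x \mid y)}{S^2} \left[ \left( \frac{x - y - \nu T}{\sigma^2 T} \right)^2 - \frac{1}{\sigma^2 T} - \frac{x - y - \nu T}{\sigma^2 T} \right] .
\end{aligned}
\]

Writing $x - y - \nu T = \sigma W_T$,

\[
\begin{aligned}
\text{gamma} &= e^{-rT} \int_{-\infty}^{\infty} F(e^x) \frac{\partial^2}{\partial S^2} \phi(x \mid \log S)\, dx \\
&= e^{-rT} \int_{-\infty}^{\infty} F(e^x)\, \phi(x \mid y)\, \frac{1}{S^2 \sigma^2 T} \left[ \frac{W_T^2}{T} - \sigma W_T - 1 \right] dx \\
&= e^{-rT} E\left[ \frac{F(S_T)}{S^2 \sigma^2 T} \left( \frac{W_T^2}{T} - \sigma W_T - 1 \right) \right] .
\end{aligned}
\]

(c) Generate $W_T \sim N(0, T)$, let $S_T = S_0 e^{(r - \sigma^2/2)T + \sigma W_T}$ and $F(S_T) = (S_T - K)^+$, then estimate the expectation in (b), with $S = S_0$, by the sample average over the simulated values.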
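The estimator in (c) can be sketched in a few lines and compared with the closed-form Black-Scholes gamma φ(d₁)/(Sσ√T). This is a minimal illustration, not the authors' code; the function names, the seed and the parameter values in the usage note are assumptions.

```python
import math
import random


def bs_gamma(s: float, k: float, r: float, sigma: float, t: float) -> float:
    """Closed-form Black-Scholes gamma of a European call."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    return math.exp(-0.5 * d1 * d1) / (math.sqrt(2.0 * math.pi) * s * sigma * math.sqrt(t))


def lr_gamma(s: float, k: float, r: float, sigma: float, t: float,
             n: int, seed: int = 0) -> float:
    """Monte Carlo gamma: e^{-rT} E[ F(S_T) (W_T^2/T - sigma W_T - 1) / (S^2 sigma^2 T) ]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        w = rng.gauss(0.0, math.sqrt(t))                       # W_T ~ N(0, T)
        s_t = s * math.exp((r - 0.5 * sigma ** 2) * t + sigma * w)
        payoff = max(s_t - k, 0.0)                             # F(S_T) = (S_T - K)^+
        weight = (w * w / t - sigma * w - 1.0) / (s * s * sigma * sigma * t)
        total += payoff * weight
    return math.exp(-r * t) * total / n
```

For example, `lr_gamma(100, 100, 0.05, 0.2, 1.0, 400000)` can be compared with `bs_gamma(100, 100, 0.05, 0.2, 1.0)`. The weight has heavy tails, so the estimator converges slowly and a large sample is needed.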


11.6 CHAPTER 6 


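1. 
\[
F(y) = P(Y \le y) = P\Bigl( U \le \frac{y-a}{b-a} \Bigr) =
\begin{cases}
0, & y < a, \\
\int_0^{\frac{y-a}{b-a}} 1\, dx = \dfrac{y-a}{b-a}, & a \le y < b, \\
1, & y \ge b,
\end{cases}
\]

\[
f(y) = \frac{1}{b-a}, \qquad a \le y < b .
\]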


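2. With $X = a + bY + \epsilon$ and $\mathrm{Cov}(\epsilon, Y) = 0$,

\[
\begin{aligned}
\mathrm{Var}(X - bY)
&= \mathrm{Var}(X) - 2b\,\mathrm{Cov}(X, Y) + b^2 \mathrm{Var}(Y) \\
&= \mathrm{Var}(X) - 2b\bigl[ \mathrm{Cov}(a, Y) + b\,\mathrm{Cov}(Y, Y) + \mathrm{Cov}(\epsilon, Y) \bigr] + b^2 \mathrm{Var}(Y) \\
&= \mathrm{Var}(X) - b^2 \mathrm{Var}(Y) .
\end{aligned}
\]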


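3. As $f(u) = e^{u^2}$ is monotone on $[0,1]$, $e^{U^2}$ and $e^{(1-U)^2}$ are negatively correlated, and

\[
\mathrm{Var}\bigl[ e^{U^2} + e^{(1-U)^2} \bigr] - \mathrm{Var}\bigl[ e^{U_1^2} + e^{U_2^2} \bigr] = 2\,\mathrm{Cov}\bigl( e^{U^2}, e^{(1-U)^2} \bigr) < 0,
\]

where $U_1$ and $U_2$ are independent $U(0,1)$ random variables.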
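The variance comparison in Problem 3 can be illustrated numerically. The sketch below assumes the integrand is f(u) = e^{u²}; it estimates the mean with antithetic pairs (U, 1 − U) and with independent pairs (U₁, U₂), and reports the sample variance of each pair average. The function names, the seed and the sample size are assumptions.

```python
import math
import random
import statistics


def f(u: float) -> float:
    return math.exp(u * u)


def compare(n: int, seed: int = 0):
    """Return (mean, variance) of antithetic pair averages, then of independent pair averages."""
    rng = random.Random(seed)
    anti, indep = [], []
    for _ in range(n):
        u = rng.random()
        anti.append(0.5 * (f(u) + f(1.0 - u)))      # antithetic pair
        u1, u2 = rng.random(), rng.random()
        indep.append(0.5 * (f(u1) + f(u2)))         # independent pair
    return (statistics.mean(anti), statistics.variance(anti),
            statistics.mean(indep), statistics.variance(indep))
```

Calling `compare(20000)` should give two means close to each other and a clearly smaller variance for the antithetic pairs, consistent with the negative covariance above.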


4, (a) 6= 1572, 4X3. 
(b) 6 = mn PASAY EX) 
(c) 6= TL 4 (A4)). 
(d) Let the control variate be $x$, with $E[x] = 1/2$. Then

\[
\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} 4x_i^3 - \hat{b}\,(\bar{x} - 1/2),
\]

where $\hat{b}$ is the least squares estimate of $b$ in $4x^3 = a + bx + \epsilon$.
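The control variate estimator in (d) can be sketched as follows, taking the integrand to be 4x³ with x ~ U(0,1) as read from (d), so that the target value is E[4x³] = 1. This is an illustration under those assumptions; the function name and the seed are hypothetical.

```python
import random
import statistics


def control_variate_estimate(n: int, seed: int = 0):
    """Return (crude estimate, control-variate estimate, least squares slope b_hat)."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [4.0 * x ** 3 for x in xs]
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    # least squares slope of y = a + b x + eps
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b_hat = s_xy / s_xx
    return y_bar, y_bar - b_hat * (x_bar - 0.5), b_hat
```

For this integrand the population slope is Cov(4x³, x)/Var(x) = 0.3/(1/12) = 3.6, so `b_hat` should be close to 3.6 and the adjusted estimate should be closer to 1 than the crude average in most runs.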

5. (a) $f(X) = (X - 2)^+$.

(b) It is known that the random variable $X - 2$ conditional on $\{X > 2\}$ follows Exp(1); hence generate $X = -\log(U) + 2$, where $U \sim U(0,1)$.

(c) Generate $X_k = -\log\bigl( \frac{U + k}{4} \bigr) + 2$, $k = 0, 1, 2, 3$.

(d) $E(X - 2) = -1$, so

\[
(X - 2)^+_{cv} = (X - 2)^+ - b\bigl( (X - 2) - \mu_{X-2} \bigr) = (X - 2)^+ - b(X - 1) .
\]


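6. The theoretical price $= S_0 N(d_1) - Ke^{-rT} N(d_2) = 6.8050$.

7. 
\[
E[S_T] = S_0 e^{(r - \sigma^2/2)T} E\bigl( e^{\sigma\sqrt{T} Z} \bigr) = S_0 e^{(r - \sigma^2/2)T} e^{\sigma^2 T/2} = S_0 e^{rT},
\]

\[
\mathrm{Var}[S_T] = S_0^2 e^{(2r - \sigma^2)T}\, \mathrm{Var}\bigl( e^{\sigma\sqrt{T} Z} \bigr) = S_0^2 e^{(2r - \sigma^2)T} e^{\sigma^2 T} \bigl( e^{\sigma^2 T} - 1 \bigr) = S_0^2 e^{2rT} \bigl( e^{\sigma^2 T} - 1 \bigr) .
\]

8. Generate $Z_i \sim N(0, 1)$ and let

\[
S_T^{i} = S_0 e^{(r - \sigma^2/2)T + \sigma\sqrt{T} Z_i} \quad \text{and} \quad \tilde{S}_T^{i} = S_0 e^{(r - \sigma^2/2)T - \sigma\sqrt{T} Z_i} .
\]

Then

\[
\hat{E}(h(S_T)) = \frac{1}{n} \sum_{i=1}^{n} \frac{h(S_T^{i}) + h(\tilde{S}_T^{i})}{2} .
\]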


11.7. CHAPTER 7 


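1. By Section 7.4, we know that $G_T = \exp\left[ \frac{1}{T} \int_0^T \log S(t)\, dt \right]$, and

\[
\log(G_T) \sim N\left( \frac{t}{T} \log G_t + \frac{T-t}{T} \log S_t + \frac{(r - \sigma^2/2)(T-t)^2}{2T},\ \frac{\sigma^2 (T-t)^3}{3T^2} \right) .
\]

Applying Lemma 3.1, we have

\[
e^{-r(T-t)} E(G_T - K)^+ = e^{-r(T-t)} E(G_T) \Phi(d_1) - e^{-r(T-t)} K \Phi(d_2),
\]

where

\[
\begin{aligned}
e^{-r(T-t)} E(G_T)
&= e^{-r(T-t)} \exp\left\{ \frac{t}{T} \log G_t + \frac{T-t}{T} \log S_t + \frac{(r - \sigma^2/2)(T-t)^2}{2T} + \frac{\sigma^2 (T-t)^3}{6T^2} \right\} \\
&= G_t^{t/T} S_t^{(T-t)/T} \exp\left\{ -r(T-t) + \frac{(T-t)^2}{T} \left( \frac{r - \sigma^2/2}{2} \right) + \frac{\sigma^2 (T-t)^3}{6T^2} \right\},
\end{aligned}
\]

\[
d_1 = \left( \frac{\sigma^2 (T-t)^3}{3T^2} \right)^{-1/2} \left( \frac{t}{T} \log G_t + \frac{T-t}{T} \log S_t - \log K + \frac{(r - \sigma^2/2)(T-t)^2}{2T} + \frac{\sigma^2 (T-t)^3}{3T^2} \right),
\]

\[
d_2 = \left( \frac{\sigma^2 (T-t)^3}{3T^2} \right)^{-1/2} \left( \frac{t}{T} \log G_t + \frac{T-t}{T} \log S_t - \log K + \frac{(r - \sigma^2/2)(T-t)^2}{2T} \right) .
\]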
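6. 
\[
\begin{aligned}
e^{-rT} E(S_T - S_{t_1})^+
&= e^{-rt_1} E\bigl( e^{-r(T - t_1)} E((S_T - S_{t_1})^+ \mid S_{t_1}) \bigr) \\
&= e^{-rt_1} E\bigl( C_{BS}(S_{t_1}, t_1; S_{t_1}, T) \bigr) .
\end{aligned}
\]

Hence, we first simulate prices $S_{t_1}$ and compute the corresponding call option prices by Black-Scholes, with strike equal to the simulated $S_{t_1}$. The discounted average of these prices is then the simulated price.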
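The conditioning argument above can be sketched as follows: simulate S_{t₁}, price an at-the-money call from t₁ to T with Black-Scholes, discount to time 0 and average. This is an illustration under stated assumptions; the function names, the seed and the parameter values in the usage note are hypothetical.

```python
import math
import random


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s: float, k: float, r: float, sigma: float, tau: float) -> float:
    """Black-Scholes call price with time to maturity tau."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def forward_start_call(s0: float, r: float, sigma: float, t1: float, t: float,
                       n: int, seed: int = 0) -> float:
    """e^{-r t1} times the average of C_BS(S_{t1}, strike = S_{t1}) over simulated S_{t1}."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        s_t1 = s0 * math.exp((r - 0.5 * sigma ** 2) * t1 + sigma * math.sqrt(t1) * z)
        total += bs_call(s_t1, s_t1, r, sigma, t - t1)   # strike set at S_{t1}
    return math.exp(-r * t1) * total / n
```

Because the Black-Scholes price is homogeneous of degree one in (spot, strike) and E[e^{-r t₁} S_{t₁}] = S₀, the result of `forward_start_call(100, 0.05, 0.2, 0.5, 1.0, 20000)` can be checked against `100 * bs_call(1, 1, 0.05, 0.2, 0.5)`.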


11.8 CHAPTER 8 


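1. Applying Taylor's series expansion, we have

\[
\begin{aligned}
& f(t + \Delta, x + \Delta_x, y + \Delta_y) - f(t, x, y) \\
&\quad = f_t \Delta + f_x \Delta_x + f_y \Delta_y + \frac{1}{2} f_{xx} \Delta_x^2 + f_{xy} \Delta_x \Delta_y + \frac{1}{2} f_{yy} \Delta_y^2 + o(\Delta),
\end{aligned}
\]

and

\[
\begin{cases}
\Delta_x = a(t, x)\Delta + b(t, x)\Delta W_1, \\
\Delta_y = \alpha(t, y)\Delta + \beta(t, y)\Delta W_2 .
\end{cases}
\]

Taking $\Delta \to 0$, we have

\[
\begin{aligned}
dx &= a(t, x)\, dt + b(t, x)\, dW_1, \\
dx^2 &= b(t, x)^2\, dt, \\
dy &= \alpha(t, y)\, dt + \beta(t, y)\, dW_2, \\
dy^2 &= \beta(t, y)^2\, dt, \\
dx\, dy &= b(t, x)\beta(t, y) \lim_{\Delta \to 0} E(\Delta W_1 \Delta W_2) = b(t, x)\beta(t, y) E(dW_1\, dW_2) = b(t, x)\beta(t, y)\rho\, dt, \\
o(\Delta) &\to 0 .
\end{aligned}
\]

Substituting these into the above equation,

\[
\begin{aligned}
df &= f_t\, dt + f_x (a\, dt + b\, dW_1) + f_y (\alpha\, dt + \beta\, dW_2) + \frac{1}{2} b^2 f_{xx}\, dt + \rho b \beta f_{xy}\, dt + \frac{1}{2} \beta^2 f_{yy}\, dt \\
&= \left( f_t + a f_x + \alpha f_y + \frac{b^2}{2} f_{xx} + \rho b \beta f_{xy} + \frac{\beta^2}{2} f_{yy} \right) dt + b f_x\, dW_1 + \beta f_y\, dW_2 .
\end{aligned}
\]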
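2. (a) Using the result of Problem 1 with $X = S_1/S_2$, we have

\[
\begin{aligned}
dX &= \left( r S_1 \frac{1}{S_2} - r S_2 \frac{S_1}{S_2^2} - \rho \sigma_1 \sigma_2 S_1 S_2 \frac{1}{S_2^2} + \frac{1}{2} \sigma_2^2 S_2^2 \frac{2 S_1}{S_2^3} \right) dt + \sigma_1 S_1 \frac{1}{S_2}\, dW_1 - \sigma_2 S_2 \frac{S_1}{S_2^2}\, dW_2 \\
&= \frac{S_1}{S_2} \bigl( (-\rho \sigma_1 \sigma_2 + \sigma_2^2)\, dt + \sigma_1\, dW_1 - \sigma_2\, dW_2 \bigr) \\
&= \frac{S_1}{S_2} \bigl( \sigma_1 (dW_1 - \rho \sigma_2\, dt) - \sigma_2 (dW_2 - \sigma_2\, dt) \bigr) \\
&= X (\sigma_1\, dW_1^* - \sigma_2\, dW_2^*) .
\end{aligned}
\]

The process $X$ is a martingale under the Brownian motions $W_1^*(t)$ and $W_2^*(t)$, as there is no drift term.

(b) Write $\sigma\, dW^* = \sigma_1\, dW_1^* - \sigma_2\, dW_2^*$. Then

\[
\begin{aligned}
E(\sigma\, dW^*)^2 &= E(\sigma_1\, dW_1^* - \sigma_2\, dW_2^*)^2 \\
&= \sigma_1^2 E(dW_1^*)^2 - 2\sigma_1 \sigma_2 E(dW_1^*\, dW_2^*) + \sigma_2^2 E(dW_2^*)^2 \\
&= (\sigma_1^2 - 2\rho \sigma_1 \sigma_2 + \sigma_2^2)\, dt \\
&= \sigma^2\, dt .
\end{aligned}
\]

Therefore $X(t)$ follows a geometric Brownian motion driven by $W^*$ with volatility $\sigma$.

(c) By the result of Exercise 3.7, $dX = \sigma X\, dW^*$ has a zero drift under $W^*$; therefore $U(t, X)$ is a martingale under the Brownian motions $W_1^*$ and $W_2^*$.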

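3. 
\[
\begin{aligned}
V(t, S_1, S_2) &= S_2(t)\, U(t, X) \\
&= S_2(t)\, E\left( \frac{V(T, S_1, S_2)}{S_2(T)} \right) \\
&= S_2(t)\, E\left( \frac{1}{S_2(T)} \bigl( S_1(T) - S_2(T) \bigr)^+ \right) \\
&= S_2(t)\, E\left( \left( \frac{S_1(T)}{S_2(T)} - 1 \right)^+ \right) \\
&= S_2(t) \left( \frac{S_1(t)}{S_2(t)} \Phi(d_1^*) - \Phi(d_2^*) \right) \\
&= S_1(t) \Phi(d_1^*) - S_2(t) \Phi(d_2^*),
\end{aligned}
\]

where

\[
d_1^* = \frac{\log \frac{S_1(t)}{S_2(t)} + \sigma^2 (T - t)/2}{\sigma \sqrt{T - t}}, \qquad d_2^* = d_1^* - \sigma \sqrt{T - t},
\]

\[
\sigma^2 = \sigma_1^2 - 2\rho \sigma_1 \sigma_2 + \sigma_2^2 .
\]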
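The exchange option formula above can be compared with a direct simulation of the payoff (S₁(T) − S₂(T))⁺ under the risk-neutral dynamics dSᵢ = rSᵢ dt + σᵢSᵢ dWᵢ with correlation ρ. The sketch below is illustrative only; the function names, the seed and the parameter values in the usage note are assumptions.

```python
import math
import random


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exchange_option(s1: float, s2: float, sigma1: float, sigma2: float,
                    rho: float, tau: float) -> float:
    """S1 Phi(d1*) - S2 Phi(d2*) with sigma^2 = sigma1^2 - 2 rho sigma1 sigma2 + sigma2^2."""
    sigma = math.sqrt(sigma1 ** 2 - 2.0 * rho * sigma1 * sigma2 + sigma2 ** 2)
    d1 = (math.log(s1 / s2) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s1 * norm_cdf(d1) - s2 * norm_cdf(d2)


def exchange_option_mc(s1: float, s2: float, r: float, sigma1: float, sigma2: float,
                       rho: float, tau: float, n: int, seed: int = 0) -> float:
    """Discounted Monte Carlo average of (S1(T) - S2(T))^+ with correlated normals."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        s1_t = s1 * math.exp((r - 0.5 * sigma1 ** 2) * tau + sigma1 * math.sqrt(tau) * z1)
        s2_t = s2 * math.exp((r - 0.5 * sigma2 ** 2) * tau + sigma2 * math.sqrt(tau) * z2)
        total += max(s1_t - s2_t, 0.0)
    return math.exp(-r * tau) * total / n
```

Note that the closed form does not involve r, whereas the simulation does; for example `exchange_option(100, 95, 0.3, 0.2, 0.5, 1.0)` and `exchange_option_mc(100, 95, 0.05, 0.3, 0.2, 0.5, 1.0, 100000)` should agree up to Monte Carlo error.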
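5. (a) We know that the arithmetic average is greater than or equal to the geometric average; therefore,

\[
\Bigl( \prod_{i=1}^{n} S_i \Bigr)^{1/n} \le \frac{1}{n} \sum_{i=1}^{n} S_i,
\]

\[
\Bigl( \prod_{i=1}^{n} S_i \Bigr)^{1/n} - K \le \frac{1}{n} \sum_{i=1}^{n} S_i - K,
\]

\[
\Bigl[ \Bigl( \prod_{i=1}^{n} S_i \Bigr)^{1/n} - K \Bigr]^+ \le \Bigl[ \frac{1}{n} \sum_{i=1}^{n} S_i - K \Bigr]^+ .
\]

(b) Suppose $dS_i = rS_i\, dt + \sigma_i S_i\, dW_i$, with $E(dW_i\, dW_j) = \rho_{ij}\, dt$, for $i, j = 1, 2, \cdots, n$. Then

\[
d\log S_i = (r - \sigma_i^2/2)\, dt + \sigma_i\, dW_i,
\]

\[
\begin{aligned}
\frac{1}{n} \sum_{i=1}^{n} d\log S_i
&= \Bigl( r - \frac{1}{2n} \sum_{i=1}^{n} \sigma_i^2 \Bigr) dt + \frac{1}{n} \sum_{i=1}^{n} \sigma_i\, dW_i \\
&= \Bigl( r - \frac{1}{2n} \sum_{i=1}^{n} \sigma_i^2 \Bigr) dt + \sigma\, dW,
\end{aligned}
\]

where $\sigma^2 = \dfrac{1}{n^2} \sum_{i,j=1}^{n} \rho_{ij} \sigma_i \sigma_j$.

Let $G(t) = \prod_{i=1}^{n} S_i(t)^{1/n}$; then we have

\[
e^{-r(T-t)} E(G(T) - K)^+ = e^{-r(T-t)} E(G(T)) \Phi(d_1) - Ke^{-r(T-t)} \Phi(d_2),
\]

where

\[
E(G(T)) = \exp\Bigl\{ \log G(t) + \Bigl( r - \frac{1}{2n} \sum_{i=1}^{n} \sigma_i^2 \Bigr)(T - t) + \frac{1}{2} \sigma^2 (T - t) \Bigr\},
\]

\[
d_1 = \frac{\log \frac{G(t)}{K} + \bigl( r - \frac{1}{2n} \sum_{i=1}^{n} \sigma_i^2 \bigr)(T - t) + \sigma^2 (T - t)}{\sigma \sqrt{T - t}}
= \frac{\log \frac{G(t)}{K} + \bigl( r + \sigma^2 - \frac{1}{2n} \sum_{i=1}^{n} \sigma_i^2 \bigr)(T - t)}{\sigma \sqrt{T - t}},
\]

\[
d_2 = d_1 - \sigma \sqrt{T - t} .
\]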
11.9 CHAPTER 9 


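1. From Section 9.3, we have

\[
B(0, T) = \lim_{n \to \infty} E\Bigl\{ \exp\Bigl( -\Delta t \sum_{i=1}^{n} r_i \Bigr) \Bigr\},
\]

where

\[
\sum_{i=1}^{n} r_i = (r_0 - b) \frac{(1 - a\Delta t) - (1 - a\Delta t)^{n+1}}{a\Delta t} + bn + \sigma \sqrt{\Delta t} \sum_{i=1}^{n} \frac{1 - (1 - a\Delta t)^i}{a\Delta t}\, \epsilon_{n-i+1} .
\]

With $n\Delta t = T$,

\[
\begin{aligned}
\lim_{n \to \infty} E\Bigl[ -\Delta t \sum_{i=1}^{n} r_i \Bigr]
&= \lim_{n \to \infty} \Bigl\{ -(r_0 - b) \frac{(1 - a\Delta t) - (1 - a\Delta t)^{n+1}}{a} - bT \Bigr\} \\
&= -(r_0 - b) \frac{1 - e^{-aT}}{a} - bT,
\end{aligned}
\]

\[
\begin{aligned}
\lim_{n \to \infty} \mathrm{Var}\Bigl[ -\Delta t \sum_{i=1}^{n} r_i \Bigr]
&= \lim_{n \to \infty} \frac{\sigma^2 \Delta t}{a^2} \sum_{i=1}^{n} \bigl[ 1 - (1 - a\Delta t)^i \bigr]^2 \\
&= \frac{\sigma^2}{a^2} \lim_{n \to \infty} \Bigl[ n\Delta t - 2\Delta t (1 - a\Delta t) \frac{1 - (1 - a\Delta t)^n}{a\Delta t} + \Delta t (1 - a\Delta t)^2 \frac{1 - (1 - a\Delta t)^{2n}}{1 - (1 - a\Delta t)^2} \Bigr] \\
&= \frac{\sigma^2}{a^3} \Bigl[ aT - 2(1 - e^{-aT}) + \frac{1}{2}(1 - e^{-2aT}) \Bigr] .
\end{aligned}
\]

Since $-\Delta t \sum_{i=1}^{n} r_i$ is normally distributed,

\[
\begin{aligned}
B(0, T) &= \lim_{n \to \infty} \exp\Bigl\{ E\Bigl[ -\Delta t \sum_{i=1}^{n} r_i \Bigr] + \frac{1}{2} \mathrm{Var}\Bigl[ -\Delta t \sum_{i=1}^{n} r_i \Bigr] \Bigr\} \\
&= \exp\Bigl\{ \lim_{n \to \infty} E\Bigl[ -\Delta t \sum_{i=1}^{n} r_i \Bigr] + \frac{1}{2} \lim_{n \to \infty} \mathrm{Var}\Bigl[ -\Delta t \sum_{i=1}^{n} r_i \Bigr] \Bigr\} \\
&= \exp\Bigl\{ -bT - (r_0 - b) \frac{1 - e^{-aT}}{a} + \frac{\sigma^2}{2a^3} \Bigl[ aT - 2(1 - e^{-aT}) + \frac{1}{2}(1 - e^{-2aT}) \Bigr] \Bigr\} \\
&= \exp\Bigl\{ -bT - (r_0 - b) \frac{1 - e^{-aT}}{a} + \frac{\sigma^2}{4a^3} \bigl[ 2aT + 4e^{-aT} - e^{-2aT} - 3 \bigr] \Bigr\} .
\end{aligned}
\]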
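The closed form for B(0, T) can be compared with a simulation of the discretized rate r_i = r_{i−1} + a(b − r_{i−1})Δt + σ√Δt ε_i, averaging exp(−Δt Σ r_i) over paths. The sketch below is an illustration only; the function names, the seed and the parameter values in the usage note are assumptions.

```python
import math
import random


def vasicek_bond(r0: float, a: float, b: float, sigma: float, t: float) -> float:
    """Closed-form B(0,T) = exp(-E[int r] + Var[int r] / 2)."""
    e1 = math.exp(-a * t)
    mean_int = b * t + (r0 - b) * (1.0 - e1) / a
    var_int = sigma ** 2 / (2.0 * a ** 3) * (2.0 * a * t + 4.0 * e1 - e1 ** 2 - 3.0)
    return math.exp(-mean_int + 0.5 * var_int)


def vasicek_bond_mc(r0: float, a: float, b: float, sigma: float, t: float,
                    steps: int, n: int, seed: int = 0) -> float:
    """Average of exp(-dt * sum_i r_i) over n simulated paths of the discretized rate."""
    rng = random.Random(seed)
    dt = t / steps
    total = 0.0
    for _ in range(n):
        r, s = r0, 0.0
        for _ in range(steps):
            r = r + a * (b - r) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            s += r                      # accumulates r_1 + ... + r_n
        total += math.exp(-dt * s)
    return total / n
```

For instance, `vasicek_bond_mc(0.05, 0.5, 0.06, 0.02, 2.0, 50, 4000)` should be close to `vasicek_bond(0.05, 0.5, 0.06, 0.02, 2.0)`, up to sampling and discretization error.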
2. (a) After discretization, we have

\[
r_{i+1} = r_i + 0.1(0.05 - r_i)\Delta t + 0.3\sqrt{r_i \Delta t}\, \epsilon_i, \quad \text{for } i = 1, 2, \cdots
\]

Using Monte Carlo, we obtain sample paths of the interest rate. For each scenario, we compute $\exp\{-\sum r_i \Delta t\}$, and the average is the simulated value of the discount factor.

(b) After discretization, we have

\[
r_{i+1}^{CV} = r_i^{CV} + 0.1(0.05 - r_i^{CV})\Delta t + \sigma_{Vasicek}\sqrt{\Delta t}\, \epsilon_i, \quad \text{for } i = 1, 2, \cdots
\]

where $r_0^{CV} = 0.052$ and $\sigma_{Vasicek} = \sigma_{CIR}\sqrt{r_0} = 0.3\sqrt{0.052}$. Compute $r_i^{CV}$ and take $\exp\{-\sum r_i^{CV} \Delta t\}$ as the control variate. Using least squares estimation, we obtain the control variate estimate of the expectation.


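3. (a) Simulate scenarios of the interest rate $r$, where

\[
r_{i+1} = r_i + (a + e^{-bt_i})\Delta t + \sigma\sqrt{\Delta t}\, \epsilon_i, \quad \text{for } i = 1, 2, \cdots
\]

Take $B(0, t)$ to be the average of $\exp\{-\sum r_i \Delta t\}$ across different scenarios.

(c) 
\[
\begin{aligned}
r_t - r_0 &= \lim_{n \to \infty} \sum_{i} \bigl[ (a + e^{-bt_i})\Delta t + \sigma \Delta W_i \bigr] \\
&= \int_0^t (a + e^{-b\tau})\, d\tau + \sigma \int_0^t dW \\
&= \Bigl[ a\tau - \frac{e^{-b\tau}}{b} \Bigr]_0^t + \sigma (W_t - W_0) \\
&= at + \frac{1 - e^{-bt}}{b} + \sigma W_t .
\end{aligned}
\]

(d) Similar to Exercise 1, we have the formula for $B(0, T)$:

\[
\begin{aligned}
\int_0^T r_\tau\, d\tau &= \int_0^T \Bigl( r_0 + a\tau + \frac{1 - e^{-b\tau}}{b} + \sigma W_\tau \Bigr) d\tau \\
&= r_0 T + \frac{aT^2}{2} + \frac{T}{b} + \frac{e^{-bT} - 1}{b^2} + \sigma \int_0^T W_\tau\, d\tau,
\end{aligned}
\]

which is Gaussian. Hence

\[
E\Bigl[ -\int_0^T r_\tau\, d\tau \Bigr] = -r_0 T - \frac{aT^2}{2} - \frac{T}{b} - \frac{e^{-bT} - 1}{b^2},
\]

\[
\mathrm{Var}\Bigl[ -\int_0^T r_\tau\, d\tau \Bigr] = \mathrm{Var}\Bigl[ -\sigma \int_0^T W_\tau\, d\tau \Bigr] = \frac{\sigma^2 T^3}{3},
\]

\[
\begin{aligned}
B(0, T) &= \exp\Bigl\{ E\Bigl[ -\int_0^T r_\tau\, d\tau \Bigr] + \frac{1}{2} \mathrm{Var}\Bigl[ -\int_0^T r_\tau\, d\tau \Bigr] \Bigr\} \\
&= \exp\Bigl\{ -r_0 T - \frac{aT^2}{2} - \frac{T}{b} - \frac{e^{-bT} - 1}{b^2} + \frac{\sigma^2 T^3}{6} \Bigr\} .
\end{aligned}
\]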
11.10 CHAPTER 10 
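1. (a) 
\[
L(\sigma^2) \propto \prod_{i=1}^{n} \frac{1}{\sqrt{\sigma^2}} \exp\left\{ -\frac{(X_i - \mu)^2}{2\sigma^2} \right\} .
\]

The posterior is $IG\bigl( \alpha + \frac{n}{2},\ \beta + \frac{1}{2} \sum_{i=1}^{n} (X_i - \mu)^2 \bigr)$.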
2. (a) Normal with known variance: 


1 _ (r=)? 1 
2 = 


e€ 20 —— 
V 210 V 210 


(b) Normal with known mean: 


aif igi Ne 1 
€ (t-B)" 3o2 


(c) Poisson: 


(d) Binomial: 

\[
{}_nC_x\, p^x (1 - p)^{n-x} = {}_nC_x\, e^{x \log p + (n - x) \log(1 - p)} .
\]
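3. Suppose that

\[
L(\theta) = g_L(x) h_L(\theta) \exp\Bigl[ \sum_i t_{L,i}(x) \psi_{L,i}(\theta) \Bigr],
\]

\[
p(\theta) = g_p(x) h_p(\theta) \exp\Bigl[ \sum_i t_{p,i}(x) \psi_{p,i}(\theta) \Bigr] .
\]

Then,

\[
\begin{aligned}
\pi(\theta) &\propto g_L(x) h_L(\theta) \exp\Bigl[ \sum_i t_{L,i}(x) \psi_{L,i}(\theta) \Bigr]\, g_p(x) h_p(\theta) \exp\Bigl[ \sum_i t_{p,i}(x) \psi_{p,i}(\theta) \Bigr] \\
&\propto \bigl( g_L(x) g_p(x) \bigr) \bigl( h_L(\theta) h_p(\theta) \bigr) \exp\Bigl[ \sum_i \bigl( t_{L,i}(x) \psi_{L,i}(\theta) + t_{p,i}(x) \psi_{p,i}(\theta) \bigr) \Bigr] .
\end{aligned}
\]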


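6. (a) Let $p(\mu) \propto A e^{-\frac{(\mu - \eta_1)^2}{2\tau_1^2}} + B e^{-\frac{(\mu - \eta_2)^2}{2\tau_2^2}}$ and $L(\mu) \propto C e^{-\frac{(x - \mu)^2}{2\sigma^2}}$. Then

\[
\begin{aligned}
\pi(\mu) &\propto C e^{-\frac{(x - \mu)^2}{2\sigma^2}} \Bigl[ A e^{-\frac{(\mu - \eta_1)^2}{2\tau_1^2}} + B e^{-\frac{(\mu - \eta_2)^2}{2\tau_2^2}} \Bigr] \\
&= AC\, e^{-\frac{(x - \mu)^2}{2\sigma^2}} e^{-\frac{(\mu - \eta_1)^2}{2\tau_1^2}} + BC\, e^{-\frac{(x - \mu)^2}{2\sigma^2}} e^{-\frac{(\mu - \eta_2)^2}{2\tau_2^2}} .
\end{aligned}
\]

Hence, this is a mixture of two normals.

(b) Let $p(\mu) \propto \sum_{j=1}^{K} A_j e^{-\frac{(\mu - \eta_j)^2}{2\tau_j^2}}$ and $L(\mu) \propto C e^{-\frac{(x - \mu)^2}{2\sigma^2}}$. Then

\[
\pi(\mu) \propto C e^{-\frac{(x - \mu)^2}{2\sigma^2}} \sum_{j=1}^{K} A_j e^{-\frac{(\mu - \eta_j)^2}{2\tau_j^2}} .
\]

Hence, this is a mixture of normals.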
7. Evaluating $\alpha_{ij} = \min\Bigl\{ 1, \dfrac{\pi_j\, p(j, i)}{\pi_i\, p(i, j)} \Bigr\}$, we have,


aj| 1 2 3 4 


i tae a5. 10 4 
i a ae 

$4 Se 1/8. cou 172 
a ye ae) ae eee 


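As $P(i, j) = p(i, j)\alpha_{ij}$ for all $i \ne j$, and each row of $P$ sums to one, we have

\[
P = \begin{pmatrix}
- & 0 & 0 & 1/3 \\
0 & - & 1/3 & 1/3 \\
0 & 1/6 & - & 1/4 \\
1/6 & 1/6 & 1/4 & -
\end{pmatrix}
= \begin{pmatrix}
2/3 & 0 & 0 & 1/3 \\
0 & 1/3 & 1/3 & 1/3 \\
0 & 1/6 & 7/12 & 1/4 \\
1/6 & 1/6 & 1/4 & 5/12
\end{pmatrix} .
\]

The transition matrix can be verified by checking the equation $\pi P = \pi$.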
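The check πP = π can be carried out exactly with rational arithmetic. In the sketch below the matrix `P` is the one displayed above, while the vector `PI` = (1/6, 1/6, 1/3, 1/3) is an assumption of the sketch: it is the distribution obtained by solving the detailed balance equations π_i P(i, j) = π_j P(j, i) for this matrix, not a value quoted from the exercise. The helper name `left_multiply` is also hypothetical.

```python
from fractions import Fraction as F

# Transition matrix from the solution above
P = [
    [F(2, 3), 0, 0, F(1, 3)],
    [0, F(1, 3), F(1, 3), F(1, 3)],
    [0, F(1, 6), F(7, 12), F(1, 4)],
    [F(1, 6), F(1, 6), F(1, 4), F(5, 12)],
]

# Stationary distribution derived from detailed balance (an assumption of this sketch)
PI = [F(1, 6), F(1, 6), F(1, 3), F(1, 3)]


def left_multiply(pi, p):
    """Row vector times matrix: (pi P)_j = sum_i pi_i P(i, j)."""
    n = len(pi)
    return [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
```

Evaluating `left_multiply(PI, P) == PI` confirms stationarity, and comparing `PI[i] * P[i][j]` with `PI[j] * P[j][i]` for each pair confirms that the chain is reversible with respect to this distribution.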


8. Similarly, 
7 ie a a 


a a 
PA uOS an I 
Seas BT es Aye 
a0: 1/3) 2. 


\[
P = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 2/3 & 1/6 & 1/6 \\
0 & 1/9 & 5/9 & 1/3 \\
0 & 1/12 & 1/4 & 2/3
\end{pmatrix} .
\]

Although the matrix $P$ satisfies the equation $\pi P = \pi$, the Markov chain is reducible, so it is not suitable for MCMC. This exercise shows that not every Markov chain can be turned into a chain with the desired stationary distribution via the Metropolis-Hastings algorithm. For most common continuous-state Markov chains, however, the Metropolis-Hastings algorithm still works.
Simulation Techniques in Financial Risk Management 